SECTION I


ABSTRACT 


QUEUEING NETWORK ANALYSIS IN SOFTWARE PHYSICS

L.  M.  Traister 
April,  1979 


Queueing  network  models  have  achieved  a  prominent  place  in  the  modelling 
of  computer  system  performance.  Recent  formulations  of  operational 
methods  by  Buzen  and  others  have  greatly  facilitated  their  practical  use. 
On  the  other  hand,  Kolence's  software  physics  has  addressed  the  problem 
of  appropriate  metrics  for  both  performance  and  capacity  of  processors 
and  systems  and  for  the  workloads  with  which  they  interact. 


This  paper  integrates  the  methods  of  operational  queueing  network  analysis 
into  an  extension  of  software  physics.  A  significant  feature  of  this 
approach  is  that  parameters  of  the  equipment  and  of  the  workload  are  not 
confounded  in  the  analysis  but  are  kept  distinct,  finally  combining  in 
the  model  computations  proper.  The  objectives,  principles  and  assumptions 
are stated and the basic quantities are defined. The fundamental operational
laws and certain of the algorithms are derived and illustrated
with examples.


SECTION II


BACKGROUND  AND  OBJECTIVES 

1.0  THE  BACKGROUND  OF  OPERATIONAL  ANALYSIS 

Queueing  network  analysis  is  a  means  of  arriving  at  the  quantities 
that  describe  the  performance  of  systems  of  limited  resources.  It 
provides  understanding  of  how  these  systems  react  to  the  demands 
made  upon  them.  Until  recently,  the  queueing  analysis  methodology 
for  computing  systems  has  been  more  or  less  a  direct  adaptation  of 
that for the traditional applications in that the performance quantities
of response time, waiting time, server utilizations and so
on are described by statistical distributions resulting from the
statistical descriptions of the "customer" arrival rates and server
service rates. Frequently this stochastic approach leads to formulations
of much complexity, often with little hope of thorough
solution. Moreover, the focus has often been more on the
pursuit of neat, closed-form mathematical solutions than on
providing a reasonably accurate assessment of the practical
situation. As Newell [NEWE71] has commented, the situation has
nearly become one of "solutions in search of a problem."

More  recently,  however,  there  has  been  work  which  maintains  touch 
with the practical situation both in its formulation of relationships,
by insisting on testability, and in its use of quantities
which must be measurable in an implemented system. Buzen, in
particular, has argued for and developed such an approach and
named it Operational Analysis. Through its development he has
created  a  useful  approach  to  performance  analysis  which  is  more 
readily  comprehended  and  put  into  practice. 

2.0  THE  BACKGROUND  OF  SOFTWARE  PHYSICS 

Kolence  [KOLE76]  has  provided  us  with  an  approach  to  quantification 
and  measurement  of  computer  system  attributes  that  is  intimately 
connected with performance. These are the concepts and metrics
for software work, software and hardware execution times and the
derived quantity of software power. The demand that a specific
function places on equipment is measured in terms of work requirement;
the capacity of the equipment to respond is measured in terms
of  power,  that  is,  work  divided  by  time.  In  addition  to  all  this 
he  has  provided  a  hierarchical  structure  for  system  architecture 
that  permits  natural  decompositions.  These  structural  properties 
are explored in some detail in [KOVA79]. What is of importance
for us here is that in using the software physics work demand and
equipment power quantities rather than the more traditional quantities
(in queueing analysis) of service time and visit ratios, we
are able to keep properties of the workload (work demand and distribution)
and properties of the equipment (software power) distinct
in expression, so that the impact of each is separately
visible  and  each  is  separately  manipulable. 

3.0  THE  OBJECTIVE  OF  THIS  EFFORT 

What  we  have  set  out  to  realize  in  this  current  work  is  the  uniting 
of  the  operational  approach  to  computing  systems  performance 
analysis  with  the  structural  concepts  and  metrics  of  software 
physics.  By  doing  so,  we  hope  at  least  to  establish  a  basis  for 
a  natural,  engineering  approach  to  operational  queueing  analysis 
for  computing  systems.  Some  of  the  benefits  of  this  approach  will 
be  encountered  in  this  paper,  such  as  the  isolation  of  workload  and 
equipment  properties  discussed  above  or  the  simple  way  in  which 
work  distribution  can  be  specified.  In  other  cases  we  observe  that 
just  the  translation  of  conventional  forms  into  software  physics 
notation  can  be  revealing  in  isolating  the  parameters  which  affect 
performance.  But  we  happily  expect  that  this  is  just  a  beginning; 
that  having  put  these  formulations  to  use  over  extended  periods, 
we  will  gain  better  insight  not  only  into  the  systems  that  they 
describe  but  into  the  extension  of  the  theory  itself,  thus  giving 
an independent life to this mode of formulation and perception.


SECTION  III 


REQUIREMENTS,  PRINCIPLES  AND  ASSUMPTIONS 

3.0  INTRODUCTION 

We  discuss  here  some  of  the  requirements  for  a  methodology  for 
the  modelling  and  validation  of  computing  system  performance 
in  which  theory  and  practical  application  can  be  integrated. 
Furthermore,  we  state  the  basic  principles  and  assumptions 
which  support  a  straightforward  and  tractable  development  of 
the  integrated  methodology. 

3.1  REQUIREMENTS  -  MODELLING  AND  VALIDATION 

Briefly  stated,  our  objective  is  to  provide  mathematical  entities 
which  characterize  the  performance  of  computer  systems,  that  is, 
a mathematical model. A fundamental requirement of our development
is that all hypotheses be capable of verification on the real
system being modelled, a concept which Buzen calls operational
testability. Unfortunately, the traditionally invoked methodology
of stochastic modelling does not meet our fundamental requirement.
This is because the basic assertion there is that the probability
distributions of stochastic processes govern or characterize the
behavior of real systems, an assertion that cannot be proved
by measurement. Furthermore, even though stochastic methods are
based on probabilistic considerations, their use describes only
the variability (in the sense of uncertainty) of the quantities
which are the parameters of the model; the process tells us nothing
about the future values of those same quantities. Thus, the problem
of parameter value prediction, as Buzen points out [BUZE77b], is
the same for both operational and stochastic methods, and here one
might make use of probabilistic methods among others. However,
this is an issue separate from that of which type of model to employ.

Concerning the parameterization of the models, we express a
preference  for  formulations  developed  around  those  same  quantities 
which  we  already  use  to  characterize  the  workload  or  the  capability 
of  processors  in  a  more  general  context.  This  naturally  leads  us 
to  a  choice  of  formulations  based  on  work  demands  made  on  processors 
by jobs or transactions (which we will term interactions). The
capability  of  the  processors  to  service  those  demands  will  be 
described  in  terms  of  work  performed  per  unit  time  or  processor 
power.  These  quantities  are  in  fact  those  of  established  software 
physics,  which  indeed  characterizes  workloads,  processors  and  many 
of  their  interactions.  It  furthermore  has  an  experimental  basis 
providing methodology for measurement of many of these same quantities.

In the subsequent development, then, we will seek to express parameters
of our models only in terms of established software physics
entities or their simple extensions. In order to facilitate clear
analysis, we wish to keep distinct those quantities which describe
the workload and those which describe the equipment. Note that the
quantities of conventional queueing analysis do not generally preserve
this distinction. For example, the conventional "service
time" quantity for, say, a disk direct access processor
derives from the action of a workload requirement (block length)
applied to an equipment capability (access time and read/write rate).
The quantities of software physics also carry with them clear specification
for their measurement in a computing system context. These
include  the  quantities  that  are  the  model  parameters  and  those  that 
describe  performance. 
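
The disk example above can be sketched numerically. This is a minimal illustration, assuming the simplest model in which a request costs one access time plus the transfer of one block; the function name and all figures are hypothetical and are not taken from the paper.

```python
def disk_service_time(block_length_bytes, access_time_s, transfer_rate_bytes_per_s):
    """Conventional service time of one disk request under an assumed simple model."""
    # Workload property: block_length_bytes.
    # Equipment properties: access_time_s and transfer_rate_bytes_per_s.
    return access_time_s + block_length_bytes / transfer_rate_bytes_per_s


# Hypothetical figures, for illustration only.
short_block = disk_service_time(2048, 0.030, 800_000)
long_block = disk_service_time(4096, 0.030, 800_000)     # only the workload changed
faster_disk = disk_service_time(4096, 0.020, 1_600_000)  # only the equipment changed
```

The single number returned changes when either the workload or the equipment changes, which is why the service time alone cannot show which of the two was responsible.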

3.2  PRINCIPLES  AND  ASSUMPTIONS 

Although  we  will  present  these  in  context  as  needed,  it  is  useful 
to  indicate  at  once  some  of  the  basic  principles  and  assumptions 
which  will  shape  our  operational  methodology. 


3.2.1  Flow  Balance 


This  conservation  principle  states  that  every  arriving  unit  of 
work  demand  to  a  configuration  (system)  or  subconfiguration 
(subsystem  or  processor)  is  matched  by  a  corresponding  completion 
of  work  there.  For  finite  time  intervals  this  statement  only 
approximates true behavior. Related to this is another conservation
principle concerning states of the system, that of state
transition  balance,  which  we  discuss  later. 
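
As a rough illustration of why flow balance only approximates true behavior over finite intervals, the sketch below compares arriving and completed work for a short and a long interval. The function names, the 1% tolerance and the measurements are all assumptions made for the example.

```python
def flow_imbalance(arrived_work, completed_work):
    """Relative difference between arriving and completed work in one interval."""
    if arrived_work == 0:
        return 0.0 if completed_work == 0 else float("inf")
    return abs(arrived_work - completed_work) / arrived_work


def is_flow_balanced(arrived_work, completed_work, tolerance=0.01):
    """True when the imbalance is within the chosen (assumed) tolerance."""
    return flow_imbalance(arrived_work, completed_work) <= tolerance


# Hypothetical measurements, in bytes of work, with the same 1,500-byte backlog.
short_interval = is_flow_balanced(arrived_work=10_000, completed_work=8_500)
long_interval = is_flow_balanced(arrived_work=1_000_000, completed_work=998_500)
```

The same unfinished backlog matters less as the interval grows, so the longer interval passes the check while the shorter one does not.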

3.2.2  Overlap  of  Subconfigurations 

We  assume  that  no  single  interaction  (job  or  transaction)  overlaps 
its use of disjoint subconfigurations. In particular, when we
consider the subconfiguration to be a processor, a single interaction
is present at only one processor at any given instant in
time. Note that this depends on some strict definitions of "job"
or "transaction" since, for example, all work could be considered
as part of the one job called "the operating system." We will
provide  such  definition  later  under  the  discussion  of  the  "software 
unit"  entity  of  software  physics. 

3.2.3  Single-Step  Behavior 

This  assumption  asserts  that  if  we  resolve  time  finely  enough 
we  will  observe  changes  in  the  system  only  at  a  single  pair  of 
processors,  appearing  as  the  movement  of  a  request  from  one 
processor  to  another.  Thus  we  eliminate  from  consideration  the 
simultaneous  "movement"  of  requests  in  the  system. 

3.2.4  Processor  Homogeneity 

This  assumption  states  that  the  output  rate  (power)  of  a  processor 
is  determined  only  by  its  queue  length  and  is  otherwise  unaffected 
by  the  arrangement  of  work  elsewhere  in  the  system.  This  assumption 
is violated in practice to some extent by the "blocking" of one
processor by another, for example when a disk unit must wait for
a buffer to be cleared by a cpu, or when occasionally a disk utilizing
RPS (rotational position sensing) cannot connect to its channel
for transmission of data because the channel is in use
by another disk unit. The homogeneity assumption implies then
that the processor is busy if there is work waiting for it. The
stochastic counterpart to this assumption is that interdeparture
times at a device are exponentially distributed. In conventional
operational analysis, the counterpart is termed device homogeneity,
but here we wish to anticipate the possibility of a subconfiguration
of equipment being considered a processor.

3.2.5  Routing  Homogeneity 

This  assumption  states  that  the  proportion  of  work  arriving  to 
the  system  that  is  routed  to  a  given  processor  may  depend  only 
on  the  multiprogramming  level,  that  is,  the  concurrent  number  of 
interactions  in  the  system.  Thus  the  arrangement  of  work  elsewhere 
in  the  system  or  length  of  queue  at  the  processor  itself  does  not 
affect  the  proportion  of  work  routed  there.  This  assumption,  for 
example,  allows  for  increased  work  at  a  paging  device  with  increased 
multiprogramming level. The stochastic counterpart of this assumption
is that job routing follows an ergodic Markov chain.

3.2.6  Decomposition 

This  principle  states  that  when  state  transitions  between  nested 
subsystems are small in number compared to the number of transitions
within the composite, we may reasonably well replace the
contained subsystem with an equivalent processor. The characteristics
of this equivalent processor are to be determined from a study
in  isolation  of  the  contained  subsystem.  We  will  make  use  of 
this  principle  later  in  the  analysis  of  an  on-line  terminal  system. 


3.2.7  Invariance  Assumptions 


These  are  not  specific  assumptions  but  rather  a  type  of  assumption. 
When  a  performance  analyst  assumes  explicitly  or  otherwise  that 
the  change  of  a  given  workload  or  processor  parameter  will  not 
change any other parameters, the assumption is being made that
those parameters are invariant under the change. Very often it
is safe to make such an assumption, but there are exceptions.

For example, if through priority setting, jobs of one class, say
A, are given preference over jobs of another class, say B, the
average multiprogramming level of class B jobs may decrease if
the number of class A jobs is increased. Thus it is suggested
that these assumptions be made explicit so that unexpected results
in model validation can be effectively researched and explained.


SECTION IV

FUNDAMENTAL  QUANTITIES 

4.0  INTRODUCTION 

This  section  presents  the  software  physics  quantities  which  are 
the  basis  for  formulating  the  laws  and  relationships  subsequently 
given.  These  quantities  are  either  primary  (directly  observable) 
or  derived  and  depend  on  time,  work  or  the  logical  interconnection 
of  equipment.  They  are: 

(1)  Properties  of  the  Equipment  and  Implementation  described  as 
the presence of processors with potential performance
characteristics.

(2)  Properties  of  the  logical  Equipment  Topology.  The  description 
of  configurations  in  software  physics;  queueing  networks. 

(3)  Work  demand  presented  to  the  configuration  by  a  specific 
task  software  unit. 

(4)  Quantities  which  describe  or  predict  the  performance  of  the 
system  with  respect  to  the  tasks  defined  by  work  demand. 

Our  mode  of  formulation  of  laws  and  relationships  will  generally 
allow  expression  in  terms  of  distinct  quantities  from  each  of  the 
above groups.

4.1 NOTATION

Software Physics quantities are given in the functional notation:

$$Q(l_1; l_2)$$

where the factor $Q$ is a quality or property and $l_1$ and $l_2$ are each
lists of designators. The list $l_1$ designates the software unit,
that is, the program and data of the specific execution of a task
or set of tasks. The list $l_2$ designates the logical subconfigurations
in the system. Software units, configurations and logical
subconfigurations are more fully discussed in [KOLE76] and [KOVA79].

For the list $l_1$ (software units) we will use:

    $L$              The full workload.

    $S$ or $S_i$     Any specific software unit (or member of a
                     collection of software units) as designated
                     by the discussion.

For the list $l_2$ (equipment) we will use:

    $\psi$           The full configuration.

    $\chi_i$ or $\chi_{ij}$
                     These designate processors (devices) or
                     collections of them which are characterized
                     by tree structures.

    cpu, disk, tape, etc.
                     Devices may be named directly as needed in
                     descriptions of the configuration.

As the property $Q$ in general describes qualities that depend on
the containment of one software unit by another or a piece or
class of equipment by another, the lists are made to describe
these relationships by giving the contained unit on the left and
the containing unit on the right. Thus if $S_i \subset S$ then

$$Q(S_i, S; \chi)$$

is the property $Q$ of the equipment $\chi$ operating with software unit
$S_i$ relative to the execution of the containing unit $S$.

As an example, the common notion of the utilization of a device
$\chi$ is denoted by:

[expression illegible in source]

which is the time of execution of $\chi$ with respect to that of $\psi$ on
behalf of all software units contained in the full workload.

But

[expression illegible in source]

is the utilization (full conditional) of the same equipment with
its time of execution counted only when it is on behalf of the
software unit $S$.

Finally, it is convenient to denote the collection of values over
all equipment classes of the property in an array or vector form.
Thus

$$\mathbf{Q}(S; \psi)$$

denotes the array with elements $Q(S; \chi_i)$, $Q(S; \chi_j)$, etc., where the
collection of $\chi_i$ constitutes the full configuration $\psi$ or some
specifically designated set of equipment classes.

A null position, indicated in the list $l_2$ by a dot, indicates that
the elements of the array are the values of $Q$ in contained equipment
relative to containing equipment.

Thus

$$\mathbf{Q}(S; \cdot, \psi)$$

is the array of elements $Q(S; \chi_i, \psi)$, $Q(S; \chi_j, \psi)$, etc., with the
designations of the $\chi_i$ either left arbitrary or defined by the
context.

4.2  BASIC  PROPERTIES 

The  quantities  of  time  and  work  are  fundamental  to  the  description 
of  the  occurrence  of  events  and  performance  in  an  executing  system. 


4.2.1 Execution Time: $Tx(S; \chi)$


This quantity measures the active time for the software unit
designated by $S$ on the subconfiguration (equipment) designated
by $\chi$. $Tx$ may thus be thought of as a clock which runs only when
the  designated  software  unit  and  equipment  are  active.  This 
definition  encompasses  both  the  "busy  time"  and  "observational 
interval"  of  conventional  operational  analysis,  but  differs  in 
that  unlike  the  conventional  formulation: 

i)  Time  may  be  counted  on  behalf  of  specific  software  units 
as  well  as  the  full  workload. 

ii) $Tx(L; \psi)$, the execution time of the full workload on the
full configuration, does not count idle time (no constituent
active), whereas the conventional "observation interval"
may do so.

Note  also  that  the  reference  to  "busy  time"  above  was  to  that 
as  defined  in  operational  analysis  and  does  not  include  time 
that  a  device  is  blocked  (e.g.,  a  disk  in  seek  or  RPS  delay). 
These points are further discussed in [KOVA79].
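
The "clock which runs only when active" can be sketched as the total length of the union of active intervals. The trace format, the function names and the figures below are hypothetical; the sketch only assumes that activity is recorded as (start, end) intervals per software unit and device.

```python
def merged_length(intervals):
    """Total length of the union of (start, end) intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def execution_time(activity, software_units, devices):
    """Time during which any listed software unit is active on any listed device."""
    selected = [
        interval
        for (unit, device), intervals in activity.items()
        if unit in software_units and device in devices
        for interval in intervals
    ]
    return merged_length(selected)


# Hypothetical trace over a 10-second observation interval:
# (software unit, device) -> active (start, end) intervals in seconds.
activity = {
    ("S1", "cpu"): [(0, 2), (5, 6)],
    ("S1", "disk"): [(2, 5)],
    ("S2", "cpu"): [(2, 4)],
    ("S2", "disk"): [(8, 10)],
}

tx_s1_cpu = execution_time(activity, {"S1"}, {"cpu"})                 # one unit, one device
tx_all_cpu = execution_time(activity, {"S1", "S2"}, {"cpu"})          # full workload, one device
tx_all_all = execution_time(activity, {"S1", "S2"}, {"cpu", "disk"})  # full workload, full configuration
```

In this trace nothing is active between seconds 6 and 8, so the full-workload, full-configuration value is shorter than the 10-second observation interval, which is the difference noted in point ii) above.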


4.2.2 Work: $W(S; \chi)$

This  quantity  in  software  physics  describes  events  accomplished 
or  demanded.  Work  performed  is  operationally  defined  to  be 
equal  to  the  number  of  bytes  (character  containers  of  8  bits) 
transferred  to  or  from  a  processor  memory.  Thus  one  unit  of 
work  is  performed  for  each  byte  written  to  a  device  (or  memory) . 
The  unit  of  work  has  been  designated  the  "WORK",  denoted  by  'J. 

As with execution time, work is counted on behalf of the software
unit and equipment designated by $S$ and $\chi$, respectively.


Thus, for example, $W(S; \mathrm{cpu})$ measures work performed by the cpu
(bytes transferred to a memory) on behalf of the software unit $S$.


Note  that  performing  a  unit  of  software  physics  work  represents 
an  event  at  a  more  elementary  level  than  does  the  "request"  of 
standard  queueing  theory  or  operational  analysis.  The  grouping 
of  work  into  requests  or  blocks  will  be  considered  to  be  a 
property  of  the  implementation. 


4.3  EQUIPMENT  AND  IMPLEMENTATION  PROPERTIES 

The quantities in this group describe the potential work performance
capability of equipment and the implementation parameters
on which this capability depends. Processors or collections of
them are frequently referred to as configurations, or more generally
subconfigurations. The topological properties of equipment interconnections
are discussed in the next section.


4.3.1 Absolute Power: $P(\cdot; \chi)$ (or $P(\chi)$), $P(S; \chi)$

This is a derived quantity of software physics defined by:

$$P(\cdot; \chi) = \frac{W(\cdot; \chi)}{Tx(\cdot; \chi)}$$

This expression states that any work performed by the device or
configuration denoted by $\chi$ is to be divided by the time $\chi$ is
active. Absolute power may be developed using the conditions
existing for a specific software unit, in which case we define
accordingly:

$$P(S; \chi) = \frac{W(S; \chi)}{Tx(S; \chi)}$$


Implementation parameters may be included in the list following
$\chi$ if desired to indicate that a specific value of, say, disk
seek time or record size was used for requests on behalf of $S$.


Absolute power at the configuration level, i.e.,

$$P(L; \psi) = \frac{W(L; \psi)}{Tx(L; \psi)}$$

can be interpreted as overall system throughput on a byte basis
in that it gives the rate of work completion in the system as a
whole as measured by the elapsed time clock.
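
A minimal numerical sketch of the power definitions follows. The function name and the measurements are hypothetical; work is counted in bytes and time in seconds, so power comes out in bytes per second.

```python
def absolute_power(work_bytes, execution_time_s):
    """Work performed divided by the time the equipment was active."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return work_bytes / execution_time_s


# Hypothetical measurements for a software unit S on a disk: P(S; disk).
p_s_disk = absolute_power(work_bytes=2_400_000, execution_time_s=3.0)
# Hypothetical totals for the full workload on the full configuration: P(L; psi).
p_system = absolute_power(work_bytes=12_000_000, execution_time_s=8.0)
```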

4.3.2 Blocksize Work: $Wb(S; \chi)$


This quantity is an implementation parameter which specifies
the quantity of work performed each time a specific interaction
visits the subconfiguration $\chi$. It thus corresponds to the work
per request when "request" is defined to be an unbroken sequence
of demands at the device on behalf of one job or transaction
user. So it is the amount of work done by the device on behalf
of the software unit $S$, uninterrupted by the execution of another
device on behalf of that same software unit.
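
To show how blocksize work connects the byte-level quantities to the familiar "request", the sketch below assumes that every visit performs exactly one blocksize of work and that the device works at a constant power. Both relations are illustrative assumptions rather than results stated in this section, and the names and figures are hypothetical.

```python
def visits_per_interaction(work_demand_bytes, blocksize_bytes):
    """Visits needed to satisfy a work demand when each visit does one block of work."""
    if blocksize_bytes <= 0:
        raise ValueError("blocksize must be positive")
    return work_demand_bytes / blocksize_bytes


def time_per_visit(blocksize_bytes, power_bytes_per_s):
    """Assumed time for one visit: blocksize work divided by the device power."""
    if power_bytes_per_s <= 0:
        raise ValueError("power must be positive")
    return blocksize_bytes / power_bytes_per_s


# Hypothetical disk demand of 81,920 bytes per interaction.
visits_large_blocks = visits_per_interaction(81_920, 4_096)
visits_small_blocks = visits_per_interaction(81_920, 2_048)
seconds_per_visit = time_per_visit(4_096, 800_000)
```

Under these assumptions, halving the blocksize doubles the number of visits without changing the work demand, which is why the grouping of work into requests is treated as a property of the implementation and not of the workload.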

4.4  PROPERTIES  OF  THE  EQUIPMENT  TOPOLOGY 

We  consider  here  two  conceptions  of  the  way  in  which  the  logical 
interconnection  of  equipment  is  described.  The  first  is  that  of 
conventional  queueing  network  analysis  and  describes  how  the 
processors  form  a  conceptual  network  as  a  consequence  of  the  way 
requests  are  routed  in  the  system.  From  this  we  develop  the  Buzen 
Central  Server  Models  (BCSM)  for  batch  and  terminal  driven  systems. 

The second conceptualization is that of the logical configurations
and subconfigurations of software physics. This mode of description
visualizes the equipment and the paths to it as forming rooted
trees.  The  useful  properties  derived  from  this  conception  include 
the  natural  way  in  which  work  demand  for  a  task  can  be  partitioned 
in  conformance  with  the  structure  of  the  logical  systems.  This 
structuring  takes  advantage  of  inclusion  properties  and  so  forms 
the  basis  for  analysis  using  natural  system  decompositions. 

4.4.1  Queueing  Networks  and  the  Routing  of  Requests 

A  queueing  network  is  a  conceptualization  of  how  the  demands  of 
interactions  enter  the  system,  circulate  from  device  to  device 
(having  received  service  after  possibly  entering  a  queue)  and 
finally  complete.  This  last  event  is  represented  by  showing 
that  the  interaction  leaves  the  system.  The  transitions  from 
device  to  device  or  to  exit  are  associated  with  probabilities 
[...] to be zero, but if it is significant, as in the case when a
long transmission line is in the network, one can substitute
a device having a power which gives the appropriate delay time.

Two network configurations of special interest are those based
on Buzen's central server model (BCSM). These are so named
because  one  device  (the  cpu)  acts  in  a  pivotal  capacity.  That 
is,  an  interaction  always  makes  a  request  visit  to  the  cpu 
before  and  after  making  a  request  of  any  other  device  or  exiting 
the  system.  This  is  depicted  in  Figure  4.4.2  which  is  the  batch 
central  server  system.  In  Figure  4.4.3  the  central  server  forms 
the  computer  subsystem  for  a  terminal  driven  system.  In  this 
latter  configuration,  M  terminal  users  are  signed  on  and  submit 
interactions  to  the  central  subsystem.  When  processing  is 
complete,  the  user  spends  Z  units  of  "think  time"  before  entering 
the  next  interaction.  During  the  processing  interval,  the  user's 
terminal  is  blocked,  that  is,  it  can  enter  no  other  interactions. 

Figure 4.4.3  Terminal System with Central Server


4.4.2  Configurations  and  Subconfigurations  in  Software  Physics 

The software physics conception of the logical structure of computing
systems is that of rooted trees formed by the graph union of
all possible data paths that may occur in the course of some execution.
A full discussion of this concept and these properties is
given in [KOVA79]; we shall only give a description of a few of
these  properties  here. 

The  concept  of  the  configuration  is  built  on  the  notion  of  the 
conventional graph of equipment interconnection, an example of
which  is  given  in  Figure  4.4.4. 

Augmentations are next made to develop logical subconfigurations.
This is done by the operation of graph composition which is the
union of instantaneous paths to drives plus the insertion of a
highest level node, if one does not already exist. For example,
in order to form the tape logical subconfiguration, one forms the
union of all paths to tape drives including the channel and controller
devices for them and inserts a node labeled "TAPE."

Examples  are  shown  for  both  disk  and  tape  logical  subconfigurations 
in  Figure  4.4.5. 


Figure 4.4.5  Equipment Class Subconfigurations: the disk subconfiguration and the tape subconfiguration
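
The graph composition described above can be sketched as follows. The device names and paths are hypothetical, and the sketch always inserts the labeled root node, whereas the text inserts one only if a highest level node does not already exist.

```python
def logical_subconfiguration(root_label, paths):
    """Union of the given device paths, joined under an inserted root node.

    Each path lists devices from the outermost (e.g. channel) down to a drive.
    Returns a mapping from each node to the sorted list of its children.
    """
    children = {}
    for path in paths:
        previous = root_label
        for node in path:
            children.setdefault(previous, set()).add(node)
            children.setdefault(node, set())
            previous = node
    return {parent: sorted(kids) for parent, kids in children.items()}


# Hypothetical paths to three tape drives.
tape_paths = [
    ["CHAN 1", "CTRL A", "TAPE 1"],
    ["CHAN 1", "CTRL A", "TAPE 2"],
    ["CHAN 2", "CTRL B", "TAPE 3"],
]
tape = logical_subconfiguration("TAPE", tape_paths)
```

Shared prefixes of the paths collapse into single nodes, so the result is a tree rooted at the inserted "TAPE" node with the drives as leaves.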

Most important is that each node defines a relative root which
has the upper lattice property, that is, every node contains or
covers the properties of all nodes dependent on it. So, for
example, in Figure 4.4.6 we observe that the "CHAN 2" logical
subconfiguration is both a tape and a disk channel subconfiguration.
In the same figure, one designated node is the entire input/output
logical subconfiguration and another is the cpu logical subconfiguration.

The  description  of  configurations  by  these  rooted  tree  structures 
will  be  useful  to  us  in  describing  overall  how  work  is  distributed 
to  equipment.  It  is  also  a  natural  means  for  dealing  with  the 
decomposition  of  systems  into  subsystems  and  so  has  application 
in  conventional  queueing  networks  having  subsystems  as  servers. 




4.4.3  The  Configuration  and  the  Bulk  Distribution  of  Work 

Recall that the symbol for an arbitrary configuration is $\chi$.

When we need to indicate containment of one subconfiguration by
another we do this by use of one or more asterisks (*) in the
subscript of $\chi$. The asterisk represents a string of one or more
subscript values. In a given context, if we use the asterisk to
represent a string of subscripts, we must remember that we are
referring to that same string whenever an asterisk occurs. Additional
asterisks in the subscript thus represent additional
levels of subscripting (depth in the structure) to that indicated
by the first. Thus $\chi_{*}$ refers to a subconfiguration at least one
level deeper than $\chi$, and $\chi_{**}$ is at least one level deeper than $\chi_{*}$,
and so on. If we write

$$\sum_{\chi_{**}} F(S; \chi_{**}, \chi)$$

then we are summing the F's for all subconfigurations that are at
least two levels below $\chi$, etc. Hierarchies of software units may
be handled in the same way.
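
The asterisk convention amounts to selecting nodes by their depth below a given node of the tree. The sketch below assumes the configuration is stored as a mapping from each node to its children; the node names and the values of F are hypothetical.

```python
def descendants_at_least(tree, node, min_levels):
    """Yield every node at least `min_levels` levels below `node`."""
    frontier = [(child, 1) for child in tree.get(node, [])]
    while frontier:
        current, depth = frontier.pop()
        if depth >= min_levels:
            yield current
        frontier.extend((child, depth + 1) for child in tree.get(current, []))


# Hypothetical input/output subconfiguration "io" playing the role of chi.
tree = {
    "io": ["disk", "tape"],
    "disk": ["chan1"],
    "tape": ["chan2"],
    "chan1": ["d1", "d2"],
    "chan2": ["t1"],
}
# Hypothetical values of some property F for each node.
F = {"disk": 5.0, "tape": 4.0, "chan1": 5.0, "chan2": 4.0,
     "d1": 3.0, "d2": 2.0, "t1": 4.0}

two_or_more_below = sorted(descendants_at_least(tree, "io", 2))
total_F = sum(F[name] for name in two_or_more_below)
```

With `min_levels=1` the same function enumerates everything the single-asterisk subscript can stand for.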


Software  physics  describes  the  demands  made  on  subconfigurations 
and  processors  in  terms  of  the  quantity  of  work  that  each  performs. 
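However, we will need to refer to quantities of work demand arriving
to the system, denoted by the vector:

$$\mathbf{Wa}(S; \psi) = \{ Wa(S; \chi_i), Wa(S; \chi_j), \ldots \}$$

where the $\{\chi_i\}$ represent a partition of $\psi$; that is:

$$\chi_i \cap \chi_j = \emptyset, \quad i \neq j$$

The ratios

$$D(S; \chi_i, \psi) = \frac{Wa(S; \chi_i)}{Wa(S; \psi)}, \qquad \psi = \bigcup_j \chi_j, \quad \{\chi_i\} \text{ a partition of } \psi$$

are called the bulk work distributions of the arriving work and
are denoted by the vector:

$$\mathbf{D}(S; \psi) = \{ D(S; \chi_i, \psi), D(S; \chi_j, \psi), \ldots \}$$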
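
A short sketch of the bulk work distribution follows, assuming the arriving work has been tallied per device of a partition of the configuration. The device names and byte counts are hypothetical.

```python
def bulk_distribution(arriving_work):
    """Share of the total arriving work that is routed to each device."""
    total = sum(arriving_work.values())
    if total <= 0:
        raise ValueError("total arriving work must be positive")
    return {device: work / total for device, work in arriving_work.items()}


# Hypothetical arriving work Wa(S; chi_i), in bytes, for a three-device partition.
Wa = {"cpu": 600_000, "disk": 300_000, "tape": 100_000}
D = bulk_distribution(Wa)
```

Because the devices partition the configuration, the shares sum to one.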

Note that we cannot yet say anything about how the measured work
performed at the $\chi_i$ relates to the arriving $Wa(S; \chi_i)$. This must
wait for a conservation assumption we will make presently.


A specific quantity of arriving work demand is that related to
a single or mean interaction (job, transaction, or the like).
This vector quantity is denoted:

$$\mathbf{Wd}(S; \psi) = \{ Wd(S; \chi_j), Wd(S; \chi_k), \ldots \}$$

Again, the magnitude is given by

$$|\mathbf{Wd}(S; \psi)| = Wd(S; \psi) = \sum_i Wd(S; \chi_i)$$

Now for a given software unit $S$ (an interaction or collection
of interactions):

$$\frac{Wd(S; \chi_i)}{Wd(S; \psi)} = \frac{Wa(S; \chi_i)}{Wa(S; \psi)} = D(S; \chi_i, \psi)$$

since the $Wa$ and $Wd$ measure the same things but over different
time bases.


In our present discussion we will use the $\chi_i$ as if they were
devices, remembering that they can just as well be proper subconfigurations.
What we leave for a subsequent analysis is how we may
in general substitute a subconfiguration for collections of devices
(or contained subconfigurations). We will thus be concerned with
work distributions of

$$\mathbf{Wa}(S; \psi) = Wa(S; \psi) \cdot \mathbf{D}(S; \psi) = \Big[ \sum_i Wa(S; \chi_i) \Big] \cdot \mathbf{D}(S; \psi)$$

and

$$\mathbf{Wd}(S; \psi) = Wd(S; \psi) \cdot \mathbf{D}(S; \psi) = \Big[ \sum_i Wd(S; \chi_i) \Big] \cdot \mathbf{D}(S; \psi)$$

where the $\chi_i$ are processors or their equivalents in the system.

The  above  distributions  of  work  are  on  a  bulk  basis.  That  is, 
they  show  how  work  distributes  on  the  average  to  the  devices  (or 
subconfigurations) .  It  will  develop  later  that  these  will  suffice 
as  workload  specification  in  the  calculation  of  the  other  descrip¬ 
tive  quantities  we  seek. 
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It  is  also  possible  to  formulate  distribution  numbers  for  proper 
subconfigurations.  This  is  done  thoroughly  in  [KOVA79] ,  so  we 
only  give  an  indication  here.  If  the  quantity  of  work  arriving 
to  a  proper  subconfiguration  configuration  is  Wa(S;x*'  then, 

Z =  - •  ~ ; - j —  is  the  bulk  distribution  of  work 

for  the  subconfiguration  X*  to  x** 

4.5 PERFORMANCE QUANTITIES

These quantities tell of the rates at which the given configuration
is processing the required workload and consequently of the (response)
time that we must wait for an interaction to complete. Intimately
connected with these quantities are the utilizations of devices,
which indicate how much of the device capacity we use on behalf of
some specific interactions or the total workload, and the lengths
of queues at the device.

4.5.1 Relative Power: $P(S_i,S;\chi^*,\chi)$; also $P(S;\chi^*,\chi)$, $P(S,L;\psi)$

This quantity is most generally defined as

\[ P(S_i,S;\chi^*,\chi) = \frac{W(S_i;\chi^*)}{Tx(S;\chi)} \]

and gives the rate of performing work on behalf of the software
unit $S_i$ on the subconfiguration $\chi^*$ relative to the clock which
counts time when $S$ and $\chi$ are active. It is sometimes called
throughput power for it gives the relative rate of work unit request
completions. This rate must not be confused with the notion of
"service rate" from conventional queueing analysis. This latter
quantity is the request completion rate relative to the clock
which is active only when the server is busy and is therefore
more like the software physics quantity of absolute power.


The quantity

\[ P(S,L;\psi) = \frac{W(S;\psi)}{Tx(L;\psi)} \]

called the software conditional relative power, may be interpreted
as the overall work level throughput for the software unit $S$. It
is the work completion rate on behalf of $S$ as measured by the
system elapsed time clock.


Notice that

\[ P(S,L;\psi) = \sum_i P(S,L;\chi_i,\psi) \]

where $\{\chi_i\}$ is a partition of $\psi$. So the software conditional
relative power is the sum of the fully conditional relative powers
over a partition of the configuration.


4.5.2 Utilization: $U(S_i,S;\chi^*,\chi)$; also $U(S;\chi^*,\chi)$

This quantity is defined most generally as:

\[ U(S_i,S;\chi^*,\chi) = \frac{Tx(S_i;\chi^*)}{Tx(S;\chi)}, \qquad S_i \subseteq S,\ \chi^* \subseteq \chi \]

and gives the ratio of time that the software unit $S_i$ is active
on subconfiguration $\chi^*$ to the time the containing software unit
$S$ is active on a containing subconfiguration. The term "utilization"
encountered in conventional parlance is:

\[ U(L;\chi_i,\psi) = \frac{Tx(L;\chi_i)}{Tx(L;\psi)} \]

That is, the ratio of the unconditional execution time on processor
$\chi_i$ to the execution time of the full workload on the full
configuration.


4.5.3 The System State:
Vectors: $\mathbf{Ww}(S) = [W_1, \ldots, W_k](S)$ or $\mathbf{n}(S) = [n_1, \ldots, n_k](S)$
Scalars: $Ww(S;\psi)$, $N(S;\psi)$


The vector quantity shows how the work or requests in the system
for software unit $S$ are distributed at some instant in time to
each of the $k$ devices (or subconfigurations). Thus $W_i$ units of
work are in queue or in service at device $i$ in the work formulation.
Similarly $n_i$ requests are in queue or in service at device $i$ in
the request formulation. Note that the scalar total waiting work is

\[ Ww(S;\psi) = \sum_{i=1}^{k} W_i(S) \]

or in terms of requests

\[ N(S;\psi) = \sum_{i=1}^{k} n_i(S) \]

The work and request forms of these quantities are related by

\[ W_i(S) = Wb(S;\chi_i)\cdot n_i(S) \]

where $Wb$ is the average processor blocksize work for the software
unit $S$.


4.5.4 System Throughput: $P(S,L;\psi)$, also $X_0(S)$
System Response Time: $Tb(S;\psi)$

Under the conditions of the conservation of work in the configuration,
the software conditional relative power for the entire configuration,
$P(S,L;\psi)$, may be interpreted as throughput on a work level basis
for the entire system. This means that for a sufficiently extended
period, $Tx(L;\psi)$, the rate of interaction completion (this quantity
is measured externally as jobs or transactions completed) is given by:

\[ X_0(S) = \frac{W(S;\psi)}{Wd(S;\psi)}\cdot\frac{1}{Tx(L;\psi)} = \frac{P(S,L;\psi)}{Wd(S;\psi)} \]

System response time is that busy time measured by the clock
at level $\psi$ for an interaction to enter the system, make visits to
the servers (and wait in queues, if necessary) and finally exit
the $\psi$ configuration. It is the time that the $\psi$ configuration is
busy with (including delay in $\psi$) an $S_i \in S$ interaction. It is
identical to the response time, $R$, of conventional operational
analysis.


SECTION  V 


LAWS  AND  RELATIONS 


5.0 INTRODUCTION

We  now  present  some  relations  between  the  defined  quantities  of 
the  last  section.  All  the  demonstrations  are  operational  in  the 
sense  that  no  stochastic  assumptions  are  made  and  that  the  quanti¬ 
ties  are  to  be  obtained  from  direct  measurement  over  finite  obser¬ 
vation  intervals.  Included  are  immediate  relations  between  the 
fundamental  quantities  as  well  as  work  flow  relations  derived  from 
consideration  of  entire  systems  in  balanced  flow. 

5.1 General Laws and Relations

These  are  valid  in  both  balanced  and  non-balanced  states  of  a 
system,  that  is,  they  depend  only  on  work  having  arrived  to  a 
server  and  not  on  any  conservation  principles  over  the  network  as 
a  whole. 


5.1.1  Utilization  Law 

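From the definitions for utilization

\[ U(S,L;\chi_i,\psi) = \frac{Tx(S;\chi_i)}{Tx(L;\psi)} \]

and for relative and absolute powers, we have:

\[ P(S,L;\chi_i,\psi) = \frac{W(S;\chi_i)}{Tx(L;\psi)} = \frac{W(S;\chi_i)}{Tx(S;\chi_i)}\cdot\frac{Tx(S;\chi_i)}{Tx(L;\psi)} \]

from which

\[ P(S,L;\chi_i,\psi) = P(S;\chi_i)\cdot U(S,L;\chi_i,\psi) \]

and so

\[ U(S,L;\chi_i,\psi) = \frac{P(S,L;\chi_i,\psi)}{P(S;\chi_i)} \qquad \text{(utilization law)} \]

When $S = L$, we have more simply:

\[ U(L;\chi_i,\psi) = \frac{P(L;\chi_i,\psi)}{P(L;\chi_i)} \]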
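The utilization law can be checked against raw measurements. This is a minimal sketch under the assumption that work done, device busy time and elapsed time are all measured over the same observation interval; the function name and the numbers are illustrative and not taken from the paper.

```python
def utilization(work_done, busy_time, elapsed_time):
    """Utilization law: U = P(relative) / P(absolute).

    work_done    -- W(S; chi_i), work completed at the device
    busy_time    -- Tx(S; chi_i), time the device was busy for S
    elapsed_time -- Tx(L; psi), system elapsed time
    """
    absolute_power = work_done / busy_time      # P(S; chi_i)
    relative_power = work_done / elapsed_time   # P(S, L; chi_i, psi)
    return relative_power / absolute_power


# Illustrative figures: 6000 units completed, device busy 4 s out of 10 s.
u = utilization(6000.0, 4.0, 10.0)
```

The result agrees with the direct definition busy_time / elapsed_time, which is the point of the law: utilization can be obtained from the two powers without timing the device itself.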


5.1.2  Little's  Law 


Enormously  useful  in  the  analysis  of  queueing  behavior  is  Little's 
Law,  so  named  for  it  was  proved  under  very  general  conditions  by 
J.D.C.  Little  in  1961.  However,  it  existed  for  many  years  before 
that  as  what  Kleinrock  calls  a  "folk  theorem."  It  relates  response 
time  of  a  system  (or  device  or  queue  itself)  to  the  amount  of  work 
waiting.  The  relationship  is: 


\[ Tw(S;\chi_i) = \frac{Ww(S;\chi_i)}{P(S,L;\chi_i,\psi)} \qquad (5.1.1) \]

where $Tw(S;\chi_i)$ is the mean waiting time at subconfiguration $\chi_i$
and $Ww(S;\chi_i)$ is the mean amount of work waiting for or in service
at $\chi_i$.


The conventional operational analysis formulation is:

\[ R_i = \frac{\bar{n}_i}{X_i} \]

where $\bar{n}_i$ is the mean number of requests waiting for or in service
at device $i$ and $X_i$ is the request completion rate.


To  show  the  software  physics  formulation  of  this  result,  consider 
that  we  observe  and  count  the  work  waiting  for  or  in  service 
(called  the  waiting  work)  as  a  function  of  time. 


Figure 5.1.1

Waiting Work at a Subconfiguration
(Example)

[Plot: waiting work $Ww(S;\chi_i)$ in bytes (vertical axis, 1k to 5k) against time (horizontal axis).]


Now if we let $A(S;\chi_i)$ be the area under the graph of waiting work,
we write:

\[ A(S;\chi_i) = \sum_{t=0}^{Tx(L;\psi)} Ww(S;\chi_i)(t) \]

The average height of the graph is:

\[ Ww(S;\chi_i) = \frac{A(S;\chi_i)}{Tx(L;\psi)} \]

This is the average amount of work in the system. Note that we
use the elapsed time $Tx(L;\psi)$ as our clock.


Now the average completion rate is $P(S,L;\chi_i,\psi)$, the work done
divided by the elapsed time, or

\[ P(S,L;\chi_i,\psi) = \frac{W(S;\chi_i)}{Tx(L;\psi)} \]

So the average time that a unit of work demand spends in the
system is:

\[ Tw(S;\chi_i) = \frac{Ww(S;\chi_i)}{P(S,L;\chi_i,\psi)} = \frac{A(S;\chi_i)}{W(S;\chi_i)} \]

Notice that $Tw(S;\chi_i)$ is not a rate per unit of work but a time.
The relationship tells us nothing about how the work arrives to
or departs from the system. In particular, if work arrives in
batched requests and is served under a first-come-first-served
(FCFS) discipline, all unit work demands in the request wait for
their service concurrently.


As an example, consider the graph of Figure 5.1.1. The accumulated
waiting work over the 10 second period (the area under the graph) is
19 Kw seconds. So the average waiting work $Ww(S;\chi_i)$ is 1.9 Kw.
The relative power is the work completed divided by the same
10 seconds, i.e.,

\[ P(S,L;\chi_i,\psi) = \frac{W(S;\chi_i)}{Tx(L;\psi)} = \frac{6\ \text{Kw}}{10\ \text{s}} = 600\ \text{w/sec} \]

(there were 6 completions of 1 KB each). Consequently the mean
waiting time at $\chi_i$ is

\[ Tw(S;\chi_i) = \frac{1.9\ \text{Kw}}{600\ \text{w/s}} \]

or about 3.17 seconds.
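The operational form of Little's Law above can be sketched in a few lines. The sample values below are hypothetical: the shape of the original graph is not recoverable, so they are chosen only so that the area totals 19 Kw seconds over 10 seconds, with 1 Kw taken as 1000 work units as the arithmetic of the example implies.

```python
def mean_waiting_time(waiting_work_samples, work_completed, dt=1.0):
    """Operational Little's Law: Tw = Ww / P(relative).

    waiting_work_samples -- waiting work observed in each interval of length dt
    work_completed       -- W(S; chi_i), work completed over the whole period
    """
    elapsed = len(waiting_work_samples) * dt          # Tx(L; psi)
    area = sum(waiting_work_samples) * dt             # A(S; chi_i)
    mean_waiting_work = area / elapsed                # Ww(S; chi_i)
    relative_power = work_completed / elapsed         # P(S, L; chi_i, psi)
    return mean_waiting_work / relative_power


# Hypothetical one-second readings (work units) whose area is 19000 unit-seconds.
samples = [1000, 2000, 3000, 2000, 1000, 2000, 3000, 2000, 2000, 1000]
tw = mean_waiting_time(samples, work_completed=6000)
```

Note that the elapsed time cancels, so the result equals the area divided by the work completed; the function keeps the two intermediate quantities only to mirror the derivation.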


5.2  FLOW  BALANCE  AND  THROUGHPUTS 

We  now  discuss  a  work  conservation  principle  at  the  configuration 
and  system  state  transition  levels.  This  principle  is  called  one 
of  flow  balance  for  as  a  consequence  processing  rates  of  individual 
subconfigurations  are  related  to  each  other  and  to  the  interaction 
level  rate  of  processing  (called  the  throughput) . 

5.2.1  Configuration  Work  Flow  Balance 

At a macroscopic level we observe the arrival of a vector of work
demand for a set of interactions, $\mathbf{Wa}(S;\psi)$, and now wish to
establish a relationship between this quantity and the work performed
inside the system. What we require is that for large enough values of
$Tx(L;\psi)$ (elapsed time), the performed work of a subconfiguration
should be very nearly the same as the total observed to arrive times
the bulk distribution quantity for that subconfiguration. That is:

\[ |W(S;\chi_i) - Wa(S;\psi)\cdot D(S;\chi_i,\psi)| \ll W(S;\chi_i) \]

for any subconfiguration (or device), where

\[ Wa(S;\psi) = \sum_i Wa(S;\chi_i) \quad \text{for any partition } \{\chi_i\} \text{ of } \psi \]

So, approximately:

\[ |W(S;\chi_i) - Wa(S;\psi)\cdot D(S;\chi_i,\psi)| = 0 \]

This conservation assumption leads us to the principle of
configuration work flow balance:

\[ \frac{Wa(S;\psi)}{Tx(L;\psi)} = \frac{W(S;\chi_i)}{Tx(L;\psi)}\cdot\frac{1}{D(S;\chi_i,\psi)} \]

that is,

\[ Pa(S,L;\psi) = \frac{P(S,L;\chi_i,\psi)}{D(S;\chi_i,\psi)} \quad \text{for any subconfiguration } \chi_i \qquad (5.2.1a) \]

Since

\[ \sum_i D(S;\chi_i,\psi) = 1 \quad \text{and} \quad \sum_i P(S,L;\chi_i,\psi) = P(S,L;\psi) \]

we may rewrite $Pa(S,L;\psi)$ in (5.2.1a) without a subscript,
giving:

\[ P(S,L;\psi) = \frac{P(S,L;\chi_i,\psi)}{D(S;\chi_i,\psi)} \qquad (5.2.1b) \]

for any subconfiguration $\chi_i$.

We may now interpret $P(S,L;\psi)$ as the work level throughput of the
configuration and the $P(S,L;\chi_i,\psi)$ as the work level throughputs
of the $\chi_i$. To develop a throughput at the interaction level we
recall that each interaction requires work:

\[ Wd(S;\psi) = \sum_i Wd(S;\chi_i) \]

Dividing (5.2.1b) on both sides by this amount of work, and using
$Wd(S;\chi_i) = D(S;\chi_i,\psi)\cdot Wd(S;\psi)$, we obtain

\[ \frac{P(S,L;\psi)}{Wd(S;\psi)} = \frac{P(S,L;\chi_i,\psi)}{Wd(S;\chi_i)} \qquad (5.2.2a) \]

\[ X_0(S) = \frac{P(S,L;\chi_i,\psi)}{Wd(S;\chi_i)} \qquad (5.2.2b) \]

for any subconfiguration $\chi_i$, where $X_0(S)$ is the interaction level
throughput for interactions belonging to $S$ and $\chi_i$ is any
subconfiguration (or device) belonging to $\psi$.


Equations (5.2.2a,b) tell us what device level throughput must
be achieved at each device in order to sustain a configuration
work throughput, $P(S,L;\psi)$, or interaction throughput, $X_0(S)$,
and so are called the Forced Flow Laws.
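The Forced Flow Laws lend themselves to a short numerical illustration. The sketch below assumes per-interaction work demands are known for each device; the function names, device names and figures are hypothetical.

```python
def required_device_powers(x0, work_demand):
    """Relative power each device must sustain for interaction throughput x0.

    Rearranges (5.2.2b): P(S, L; chi_i, psi) = X0(S) * Wd(S; chi_i).
    """
    return {dev: x0 * wd for dev, wd in work_demand.items()}


def interaction_throughput(relative_power, work_demand, device):
    """X0(S) = P(S, L; chi_i, psi) / Wd(S; chi_i), for any one device."""
    return relative_power[device] / work_demand[device]


# Hypothetical demands per interaction (work units) and a target of 2 interactions/s.
demand = {"cpu": 6000.0, "disk": 50.0}
powers = required_device_powers(2.0, demand)
```

Under flow balance, `interaction_throughput` should return the same value whichever device is chosen, which is what makes the law useful for cross-checking measurements.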


5.2.2  Flow  Balance  in  General  Queueing  Networks 

In  traditional  formulations  of  queueing  networks,  work  demands 
for  a  given  job  are  imagined  to  circulate  from  device  to  device 

with routing determined by the $q_{ij}$, and job entry or exit is
thought of as from or to device zero. Conservation of Transition
equations are written at each node to express flow balance.

That is, for a request size $Wb(S;\chi_i)$:

\[ \frac{W(S;\chi_j)}{Wb(S;\chi_j)} = \sum_{i=0}^{k} \frac{W(S;\chi_i)}{Wb(S;\chi_i)}\, q_{ij}, \qquad j = 0,\ldots,k \qquad (5.2.3a) \]

This is an expression of the fact that the number of requests
completed at $\chi_j$ is the same as the total number which arrived
from all sources connected (in the operational sense) to $\chi_j$.
From the definition of relative power, we get after dividing
both sides of (5.2.3a) by $Tx(L;\psi)$:

\[ \frac{P(S,L;\chi_j,\psi)}{Wb(S;\chi_j)} = \sum_{i=0}^{k} \frac{P(S,L;\chi_i,\psi)}{Wb(S;\chi_i)}\, q_{ij}, \qquad j = 0,\ldots,k \qquad (5.2.3b) \]

whence:

\[ P(S,L;\chi_j,\psi) = \sum_{i=0}^{k} P(S,L;\chi_i,\psi)\,\frac{Wb(S;\chi_j)}{Wb(S;\chi_i)}\, q_{ij}, \qquad j = 0,\ldots,k \qquad (5.2.3c) \]

We note that the quantities $P(S,L;\chi_0,\psi)$ and $Wb(S;\chi_0)$ represent
rates and events at the external interaction level and are given by:

\[ P(S,L;\chi_0,\psi) = P(S,L;\psi) = \sum_{i=1}^{k} P(S,L;\chi_i,\psi) \]

\[ Wb(S;\chi_0) = Wd(S;\psi) = \sum_{i=1}^{k} Wd(S;\chi_i) \]

The expressions (5.2.3c) are called the Flow Balance Node
Equations. The quantity

\[ \frac{\sum_{i=1}^{k} P(S,L;\chi_i,\psi)}{\sum_{i=1}^{k} Wd(S;\chi_i)} \]

is just the external interaction throughput $X_0(S)$ when $Tx(L;\psi)$
is sufficiently large.

In Buzen's operational analysis [DENN78], the ratios of device
to external interaction throughput are defined as visit ratios by

\[ V_i = \frac{X_i}{X_0} \]

where the $X_i$ are the device request completion rates.


Here the analogous quantities are obtained by taking the ratios
of device relative power with the software conditional $\psi$-level
power, that is:

\[ V(S;\chi_i) = \frac{P(S,L;\chi_i,\psi)}{P(S,L;\psi)} = \frac{W(S;\chi_i)}{W(S;\psi)} = D(S;\chi_i,\psi) \qquad (5.2.4a) \]

Equation (5.2.4a) shows that each byte of work completed by $\psi$
requires $D(S;\chi_i,\psi)$ bytes completed by $\chi_i$; that is, $V$ is a "per
byte" visit ratio. For a completion at the interaction level we
observe a visit ratio:

\[ V^*(S;\chi_i) = V(S;\chi_i)\cdot Wd(S;\psi) = D(S;\chi_i,\psi)\cdot Wd(S;\psi) = Wd(S;\chi_i) \qquad (5.2.4b) \]

That is, the quantity $Wd(S;\chi_i)$, the work demand, is the same quantity
as the number of visits for the transaction or job event.


Now, Equation (5.2.3b) can be divided by $P(S,L;\psi)$ on each side
(and multiplied through by $Wb(S;\chi_j)$) to give:

\[ D(S;\chi_j,\psi) = \frac{Wb(S;\chi_j)}{Wd(S;\psi)}\, q_{0j} + \sum_{i=1}^{k} D(S;\chi_i,\psi)\,\frac{Wb(S;\chi_j)}{Wb(S;\chi_i)}\, q_{ij}, \qquad j = 0,\ldots,k \qquad (5.2.5) \]

Equations (5.2.5) are called the Work Distribution Node Equations
and are analogous to the Visit Ratio Equations of conventional
Operational Analysis. It is important to note, however, that the
right hand side of (5.2.5) keeps workload quantities (the distribution
numbers) and implementation quantities (the blocksize work and
routing frequencies) separate, thereby suggesting analyses where
each may be varied independently. The independence of these
quantities is valid only approximately as, in reality, the
distribution numbers may be altered to some extent by changes in the
blocksize work (such as when cpu overheads are reduced by increases
in blocksizes).

As a practical matter, we should observe that even if the
distribution numbers and blocksize work quantities are known, the
$(K+1)^2$ routing frequencies cannot be found in the general network
since the equations (5.2.5) are only $K+1$ in number. Fortunately we
will not be required to do so, for our invocation of the product form
solution later on will require us to provide only the interaction
work demand (or distribution numbers) and the absolute
subconfiguration powers.

The main use of the Work Distribution Node Equations is in fact
to prove the validity of the product form solution. As for the
distribution numbers themselves, and as a consequence the interaction
work demand values, they may be derived from an analysis of
the workload or approximated by measurement of the workload in
execution.
5.2.3  A  Special  Case  -  The  Buzen  Central  Server  Model  (BCSM) 

It turns out that for the BCSM (see Figure 4.4.2), the Flow
Balance and Work Distribution Node Equations (5.2.3c and 5.2.5)
simplify greatly and can be solved for the $q_{ij}$. Note first that:

\[ q_{01} = 1; \qquad q_{i1} = 1,\ i \ge 2; \qquad q_{0i} = 0,\ i \ge 2; \qquad q_{i0} = 0,\ i \ge 2 \]

Now the Flow Balance Equations are:

\[ P(S,L;\chi_0,\psi) = P(S,L;\chi_1,\psi)\,\frac{Wb(S;\chi_0)}{Wb(S;\chi_1)}\, q_{10} \qquad (5.2.6a) \]

\[ P(S,L;\chi_1,\psi) = \sum_{i=0,\ i \neq 1}^{k} P(S,L;\chi_i,\psi)\,\frac{Wb(S;\chi_1)}{Wb(S;\chi_i)} \qquad (5.2.6b) \]

\[ P(S,L;\chi_i,\psi) = P(S,L;\chi_1,\psi)\,\frac{Wb(S;\chi_i)}{Wb(S;\chi_1)}\, q_{1i}, \qquad i = 2,\ldots,k \qquad (5.2.6c) \]

Substituting

\[ D(S;\chi_i,\psi) = \frac{P(S,L;\chi_i,\psi)}{P(S,L;\psi)}, \qquad i = 1,\ldots,k \]

we get:

\[ 1 = D(S;\chi_1,\psi)\,\frac{Wb(S;\chi_0)}{Wb(S;\chi_1)}\, q_{10} = D(S;\chi_1,\psi)\,\frac{Wd(S;\psi)}{Wb(S;\chi_1)}\, q_{10} \qquad (5.2.7a) \]

\[ D(S;\chi_1,\psi) = \frac{Wb(S;\chi_1)}{Wd(S;\psi)} + \sum_{i=2}^{k} D(S;\chi_i,\psi)\,\frac{Wb(S;\chi_1)}{Wb(S;\chi_i)} \qquad (5.2.7b) \]

\[ D(S;\chi_i,\psi) = D(S;\chi_1,\psi)\,\frac{Wb(S;\chi_i)}{Wb(S;\chi_1)}\, q_{1i}, \qquad i = 2,\ldots,k \qquad (5.2.7c) \]

And from these we get the routing frequencies:

\[ q_{10} = \frac{Wb(S;\chi_1)}{D(S;\chi_1,\psi)\cdot Wd(S;\psi)}, \qquad \text{where } Wd(S;\psi) = \sum_i Wd(S;\chi_i),\ \{\chi_i\} \text{ a partition of } \psi \]

\[ q_{1i} = \frac{D(S;\chi_i,\psi)\cdot Wb(S;\chi_1)}{D(S;\chi_1,\psi)\cdot Wb(S;\chi_i)}, \qquad i = 2,\ldots,k \]

Notice that substituting this last expression for $q_{1i}$ into (5.2.6c)
gives the same result obtained from the Configuration Work Flow
Balance Principle, namely:

\[ \frac{P(S,L;\chi_i,\psi)}{D(S;\chi_i,\psi)} = P(S,L;\psi) = \frac{P(S,L;\chi_1,\psi)}{D(S;\chi_1,\psi)} \]

Although these Configuration Work Flow Balance properties hold in
any node flow balanced network, they are sufficient for the
derivation of the Node Flow Balance Equations only in networks
which preserve the rooted tree type structures of software physics.
The BCSM does in fact do this, if one allows the cpu configuration
to include the input/output devices as a subconfiguration. We can
therefore directly write the node balance equations for the BCSM:


D(S;x1A) 


(5.2.8) 
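The BCSM routing frequencies discussed in this subsection can be checked numerically. The sketch below takes q10 = Wb(S;chi_1) / (D(S;chi_1,psi) Wd(S;psi)) and q1i = D(S;chi_i,psi) Wb(S;chi_1) / (D(S;chi_1,psi) Wb(S;chi_i)); this is our reading of the derivation and should be treated as an assumption. The function name and the sample figures are hypothetical.

```python
def bcsm_routing(work_demand, blocksize):
    """Routing frequencies out of the central server (cpu) in a BCSM.

    work_demand[0], blocksize[0] refer to the cpu (device 1 in the text);
    the remaining entries are the I/O devices. Returns (q10, [q1i ...]).
    """
    total = sum(work_demand)                      # Wd(S; psi)
    d = [wd / total for wd in work_demand]        # distribution numbers
    q10 = blocksize[0] / (d[0] * total)           # cpu -> exit
    q1i = [d[i] * blocksize[0] / (d[0] * blocksize[i])
           for i in range(1, len(d))]             # cpu -> I/O device i
    return q10, q1i


# Hypothetical interaction: 10 cpu requests of 100 units, 4 requests of 50
# units at one disk, 5 requests of 20 units at another.
q10, q1i = bcsm_routing([1000.0, 200.0, 100.0], [100.0, 50.0, 20.0])
```

With demands chosen so that cpu visits equal one plus the total I/O visits, as the central server structure requires, the frequencies come out as 0.1, 0.4 and 0.5 and sum to one. Data that violate that visit count will not produce a proper probability distribution, which can serve as a consistency check on measurements.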


5.3  RESPONSE  TIME  LAWS 


5.3.1  General  Response  Time 

If we apply Little's Law to a configuration as a whole, we may
write the response time for a single byte of work demand in an
interaction when there are $N$ interactions active for software
unit $S$:

\[ tb(S;\psi) = \frac{N}{P(S,L;\psi)} \qquad (5.3.1) \]

Since

\[ N = \sum_j n_j(S) \]

and multiplying numerator and denominator of (5.3.1) by distribution
numbers $D(S;\chi_j,\psi)$:

\[ tb(S;\psi) = \sum_j \frac{n_j}{P(S,L;\chi_j,\psi)}\cdot D(S;\chi_j,\psi) = \sum_j t[(S;\chi_j)(n_j)]\cdot D(S;\chi_j,\psi) \qquad (5.3.2) \]

with $t[\ ]$ being the per byte queue plus service time at the indicated
subconfiguration.


What this says in words is that the formulated scheduling here is
such that each byte of interaction work demand for each server
experiences a delay plus service time equal to that when there are
$n_j$ bytes (for software unit $S$) in the queue.

For an entire interaction with work demand $Wd(S;\psi)$:

\[ Tb(S_i;\psi) = \sum_j t[(S;\chi_j)(n_j)]\cdot D(S;\chi_j,\psi)\cdot Wd(S;\psi) = \sum_j t[(S;\chi_j)(n_j)]\cdot Wd(S;\chi_j) \qquad (5.3.3) \]

Again, we may interpret (5.3.3) as giving the total response time
in terms of a "homogenized" product of a response time per byte
(dependent on the multiprogramming level at each server) and byte
level visits per interaction. This formulation may be interpreted
as for a round robin (RR) scheduling algorithm approximating processor
sharing.

Returning to Equation (5.3.2) and multiplying by the work per
interaction, $Wd(S;\psi)$, and by unity in the form of $Wb(S;\chi_j)/Wb(S;\chi_j)$:

\[ Tb(S_i;\psi) = \sum_j \frac{n_j}{P(S,L;\chi_j,\psi)}\cdot Wd(S;\chi_j)\cdot\frac{Wb(S;\chi_j)}{Wb(S;\chi_j)} \]

\[ = \sum_j \frac{n_j\cdot Wb(S;\chi_j)}{P(S,L;\chi_j,\psi)}\cdot\frac{Wd(S;\chi_j)}{Wb(S;\chi_j)} = \sum_j Tw(S;\chi_j)\cdot\frac{Wd(S;\chi_j)}{Wb(S;\chi_j)} \qquad (5.3.4b) \]

For the right hand side of Equation (5.3.4b), the first factor is
the request waiting time at $\chi_j$ when there are $n_j$ requests there and
the second factor is the number of times requests for an $S_i \in S$ appear
at $\chi_j$, that is, the visit ratio of conventional operational
analysis. Equations (5.3.4) give precisely the same waiting time
(interaction response time) as does (5.3.3), differing only in
the multiplication by unity and the rearrangement of terms.

This form of the equation for $Tb$ can be interpreted as for
a first-come-first-served (FCFS) scheduling algorithm at the
request level, each request being of size $Wb$.

5.3.2 Interactive Response Time

By application of Little's Law to an interactive system we can
derive an expression for the response time for an interaction
(here a transaction). Referring to Figure 5.3.1, we note that the
outside observer "sees" the time for an interaction to make a
complete circuit as $Tb(S_i;\psi) + Z$, where $Z$ is the "think" time of
each signed-on user. That is, if the terminals and the central
system are configuration $\psi'$, then the observer measures:

\[ Tb(S_i;\psi') = Tb(S_i;\psi) + Z \]

If the number of concurrent users (terminals signed on) is $M$ then
we have by Little's Law:

\[ Tb(I;\psi') = \frac{M}{X_0(I)} \qquad (5.3.5a) \]

\[ Tb(I;\psi') = \frac{M}{P(I,L;\psi)/Wd(I;\psi)} \qquad (5.3.5b) \]

Or as seen by the terminal subsystem:

\[ Tb(I;\psi) = \frac{M}{X_0(I)} - Z \qquad (5.3.6a) \]

\[ Tb(I;\psi) = M\cdot\frac{Wd(I;\chi_i)}{P(I,L;\chi_i,\psi)} - Z, \qquad \text{any } \chi_i \in \psi \qquad (5.3.6b) \]

Equations (5.3.6) are called the Interactive Response Time
Formulas.
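The interactive response time relation, response time equal to the number of terminals divided by throughput, less the think time, is simple enough to sketch directly. The function names below are hypothetical, and the inverse form is included only as a convenience for checking measurements.

```python
def interactive_response_time(terminals, throughput, think_time):
    """Tb = M / X0 - Z: response time seen by the terminal subsystem."""
    return terminals / throughput - think_time


def throughput_from_response(terminals, response_time, think_time):
    """Inverse form: X0 = M / (Tb + Z)."""
    return terminals / (response_time + think_time)


# Illustrative figures: 10 terminals, 2 interactions/s, 3 s think time.
tb = interactive_response_time(10, 2.0, 3.0)
```

A measured response time that disagrees noticeably with this relation suggests that the terminal count, throughput and think time were not all observed over the same interval.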


Note that since we must have:

\[ Tb(I;\psi) \ge \frac{1}{X_0(I)} \]

it follows that:

\[ M - Z\cdot X_0(I) \ge 1 \qquad (5.3.7) \]

SECTION VI


MULTIPROGRAMMING LOAD ANALYSIS


6.0 INTRODUCTION

In  this  section  we  will  demonstrate  queueing  network  analysis  for 
configurations  which  are  of  the  type  having  product  form  solutions. 
These  separable  queueing  networks  were  originally  studied  as  sto¬ 
chastic  exponential  server  entities  by  Jackson,  Gordon  and  Newell 
and  others.  Buzen  and  Denning  [BUZE77]  have  demonstrated  similar 
properties  for  networks  with  "operational"  assumptions.  Reiser 
[REIS79]  points  out  that  queueing  networks  with  product  form  solu¬ 
tions  are  robust  with  regard  to  routing  and  service  time  distribu¬ 
tions,  that  is,  it  is  the  mean  values  that  dominate  the  solution. 

We  will  investigate  both  approximate  and  exact  methods  for  closed 
systems.  The  approximate  method  is  based  on  a  consideration  of 
asymptotic  behavior  and  has  been  called  "bottleneck  analysis”  by 
Buzen,  a  particularly  appropriate  term,  for  it  is  the  behavior  of 
the  bottleneck  device  in  a  separable  queueing  network  which  acts 
as  the  limiting  resource.  The  method  is  most  useful  for  quickly 
determining  the  effectiveness  of  changes  to  the  equipment  or  work 
demands  under  light  or  heavy  levels  of  multiprogramming.  The 
exact  methods  implemented  through  efficient  algorithms  provide 
throughput  functions  for  queueing  networks  at  any  level  of  multi¬ 
programming  and  as  a  consequence  provide  the  basis  for  an  appeal¬ 
ing  technique  for  the  analysis  of  terminal  driven  systems. 

6.1  BOTTLENECK  ANALYSIS 

In  this  section  we  show  how  throughput  in  closed  systems  varies  with 
increasing  multiprogramming  load  under  the  assumptions  that  work 
demand  and  subconfiguration  powers  remain  invariant. 


6.1.1  The  Bottleneck  Subconfiguration 


Since

\[ U(S,L;\chi_i,\psi) = \frac{P(S,L;\chi_i,\psi)}{P(S;\chi_i)} = \frac{W(S;\chi_i)}{Tx(L;\psi)\cdot P(S;\chi_i)} \]

we have the ratio of utilizations for $\chi_i$, $\chi_j$:

\[ \frac{U(S,L;\chi_i,\psi)}{U(S,L;\chi_j,\psi)} = \frac{W(S;\chi_i)}{Tx(L;\psi)\cdot P(S;\chi_i)}\cdot\frac{Tx(L;\psi)\cdot P(S;\chi_j)}{W(S;\chi_j)} = \frac{W(S;\chi_i)/P(S;\chi_i)}{W(S;\chi_j)/P(S;\chi_j)} \qquad (6.1.1) \]

Since the ratio of utilizations is expressed in terms of load
invariant quantities (by assumption), the ratio itself is
invariant with multiprogramming load. Also, the subconfiguration
(say $\chi_i$) with the largest value of

\[ \frac{W(S;\chi_i)}{P(S;\chi_i)} \]

over some time period $Tx(S;\psi)$ has the highest utilization under
any system load. It will thus be the first subconfiguration to attain
100% utilization when the load is sufficiently heavy. From the forced
flow law applied to this device, we observe:

\[ X_0(S) = \frac{P(S,L;\chi_i,\psi)}{Wd(S;\chi_i)} = \frac{P(S;\chi_i)}{Wd(S;\chi_i)}\cdot U(S,L;\chi_i,\psi) \ \to\ \frac{P(S;\chi_i)}{Wd(S;\chi_i)} \quad \text{as } U(S,L;\chi_i,\psi) \to 1 \qquad (6.1.2) \]

Equation (6.1.2) shows that system throughput is limited by the
value

\[ \frac{P(S;\chi_i)}{Wd(S;\chi_i)} \]

as the $\chi_i$ subconfiguration saturates, that is, approaches 100%
utilization. This limiting action motivates calling such a
subconfiguration a bottleneck; note that there may be more than one
in a configuration. For a given collection of subconfigurations
$\{\chi_i\}$ which are a partition of $\psi$, denote the bottleneck by a
subscript "b", so that $\chi_b$ is the subconfiguration for which

\[ \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \]

is a maximum. Thus

\[ \frac{Wd(S;\chi_b)}{P(S;\chi_b)} = \max_i \left( \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \right) \]

Note that this choice is software unit dependent, for in a
configuration supporting multiple software units it is possible that

\[ \chi_b(S) \neq \chi_b(T) \quad \text{for } S \neq T \]

Now as a single interaction makes its way through the configuration,
we have that the total time spent at any subconfiguration
$\chi_i$ (denoted here by $Txd(S;\chi_i)$) is:

\[ Txd(S;\chi_i) = \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \]

This is valid regardless of how many separate visits are made
to each $\chi_i$ on behalf of any single interaction in $S$ (denoted $S_1$).

Now for all the subconfigurations $\chi_i$, the response time is:

\[ Tb(S_1;\psi)_{min} = \sum_{i=1}^{k} \frac{Wd(S;\chi_i)}{P(S;\chi_i)} = \sum_{i=1}^{k} Txd(S;\chi_i) \]

and the corresponding throughput derived using Little's Law for
$N = 1$ is:

\[ X_0(S) = \frac{1}{Tb(S_1;\psi)_{min}} = 1 \Big/ \sum_{i=1}^{k} \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \]

We can now sketch a curve for throughput as in Figure 6.1.1.


Figure  6.1.1 

Bottleneck  Analysis  -  Throughput 

One asymptote is the horizontal line for the bottleneck's limiting
throughput of

\[ \frac{P(S;\chi_b)}{Wd(S;\chi_b)} \]

The other asymptote is the line commencing at the origin and
passing through $1/Tb(S_1;\psi)_{min}$ when $N$, the multiprogramming load, is 1.

The approximate throughput function commences at load $N = 1$ with
corresponding throughput $1/Tb(S_1;\psi)_{min}$, rising monotonically and
staying below the asymptote with slope $1/Tb(S_1;\psi)_{min}$, and
approaching (with increasing $N$) the horizontal asymptote drawn through
throughput value

\[ \frac{P(S;\chi_b)}{Wd(S;\chi_b)} \]

The load value $N^*$ where the two asymptotes intersect is where

\[ N^* = \frac{P(S;\chi_b)}{Wd(S;\chi_b)}\cdot Tb(S_1;\psi)_{min} = \frac{P(S;\chi_b)}{Wd(S;\chi_b)}\cdot\sum_{i=1}^{k}\frac{Wd(S;\chi_i)}{P(S;\chi_i)} \qquad (6.1.2) \]

This load is called the system saturation point [KLEI76] and is
given significance in that $N > N^*$ implies that interactions in
the system are causing mutual delays through queueing. That is,
$N^*$ is exactly the maximum number of perfectly scheduled
interactions that would cause no mutual interference. This is
because the fraction of time required at the bottleneck device
for a single interaction compared to its total service time is

\[ \frac{Txd(S;\chi_b)}{Tb(S_1;\psi)_{min}} \qquad \text{where } Txd(S;\chi_b) = \frac{Wd(S;\chi_b)}{P(S;\chi_b)} \]

This implies that

\[ \frac{Tb(S_1;\psi)_{min}}{Txd(S;\chi_b)} \]

similar interactions could be scheduled at $\chi_b$ without causing
interference with each other, and this is exactly what is given by
Equation (6.1.2).


Figure 6.1.2

Bottleneck Analysis - Response Time


As one dare not assume that such perfect scheduling at
subconfigurations is in fact realized even for $N < N^*$, the curve
sketched remains below the sloping asymptote and the throughputs
$N/Tb(S_1;\psi)_{min}$ are not achieved.


A similar type of analysis, but in terms of response time, for
terminal driven systems leads to a sketch as in Figure 6.1.2.

As before, the minimum 1-interaction response time is given by:

\[ Tb(S_1;\psi)_{min} = \sum_{i=1}^{k} \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \]

and this is directly plotted as the horizontal asymptote. Next
we recall that when saturated, the bottleneck subconfiguration
$\chi_b$ limits throughput to

\[ X_0(S) = \frac{P(S;\chi_b)}{Wd(S;\chi_b)} \quad \text{when } \chi_b \text{ is saturated.} \]

From this and the Interactive Response Time Law we have:

\[ Tb(S_1;\psi) = \frac{M}{X_0(S)} - Z \ \ge\ M\cdot\frac{Wd(S;\chi_b)}{P(S;\chi_b)} - Z = M\cdot Txd(S;\chi_b) - Z \qquad (6.1.3) \]

and the right-hand side of the inequality is the equation for the
slanting asymptote in Figure 6.1.2.


The intersection of this asymptote with the $M$-axis is at:

\[ M_b = Z\cdot\frac{P(S;\chi_b)}{Wd(S;\chi_b)} = \frac{Z}{Txd(S;\chi_b)} \qquad (6.1.4a) \]

The intersection with the $Tb(S_1;\psi)_{min}$ asymptote is at:

\[ M_b^* = \big( Tb(S_1;\psi)_{min} + Z \big)\cdot\frac{1}{Txd(S;\chi_b)} \qquad (6.1.4b) \]

\[ M_b^* = N^* + M_b \qquad (6.1.4c) \]

The significance of $M_b^*$ is like that of $N^*$ for batch systems.
$M_b^*$ is that number of terminals which could be scheduled without
interference. So for $M > M_b^*$, queueing is certain within the
system.


Improvement of the bottleneck device uncovered by the above
methods results in gains in system performance only until its
value of $Txd(S;\chi_b)$ is decreased to the

\[ Txd(S;\chi_i) = \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \]

of some other subconfiguration in the system. As an example,
consider a central subsystem for a terminal system having only
a cpu and disk, and suppose the following data are collected:


P(S;cpu) = 18 Mw/s          P(S;disk) = 100 Kw/s

Wd(S;cpu) = 6 Mw            Wd(S;disk) = 50 Kw

Think time Z = 25 seconds


From the above data we have:


Txd(S;cpu) = 1/3 second     Txd(S;disk) = 1/2 second


So the bottleneck device is the disk. We plot the response
time asymptotes in Figure 6.1.3a by first computing the minimum
response time.

\[ Tb(S_1;\psi)_{min} = \sum_{i=1}^{k} \frac{Wd(S;\chi_i)}{P(S;\chi_i)} = 1/3 + 1/2 = 5/6 \text{ second} \]

The response time asymptote for the bottleneck device (the disk)
is given by:

\[ Tb(S_1;\psi) = M\cdot Txd(S;\text{disk}) - Z \]

with

\[ N^* = \frac{1/2 + 1/3}{1/2} = 1.67, \qquad M_b = \frac{25}{1/2} = 50, \qquad M_b^* = 50 + 1.67 = 51.7 \text{ terminals.} \]

Figure 6.1.3a

Bottleneck Analysis Example
Disk Bottleneck Action

We should realize that the cpu is a potential system bottleneck,
that is, were disk power sufficiently increased, the cpu would
become the bottleneck. This more complex characterization is
depicted in Figure 6.1.3b. Improvement obtained from a speedup
of the disk is limited by the potential bottleneck action of the
cpu.


Figure 6.1.3b

Bottleneck Analysis Example
Potential Bottleneck Action of the CPU

6.2 STATE SPACE BALANCE AND THE PRODUCT FORM SOLUTION


Buzen  and  Denning  show  by  operational  methods  in  [BUZE77a]  that  the 
rates  of  transitions  in  and  out  of  system  states  satisfy  balance 
equations  similar  to: 


$$\sum_{i,j} p(W_{ij})\,(q_{ij} + q_{i0}q_{0j})\,\frac{P(S;\chi_i)}{Wb(S;\chi_i)} = \sum_{i} p(W)\,I_i\,\frac{P(S;\chi_i)}{Wb(S;\chi_i)} \tag{6.2.1}$$

for all $Ww(S;\psi)$ and $i,j = 1,\ldots,k$,

where $p(W_{ij})$ is the proportion of $Tx(S;\psi)$ that the system is in
the state defined by:

$$Ww(S;\psi)_{ij} = [\,Ww(S;\chi_1), \ldots, Ww(S;\chi_i) + Wb(S;\chi_i), \ldots, Ww(S;\chi_j) - Wb(S;\chi_j), \ldots, Ww(S;\chi_k)\,]$$

and where:

$$Ww(S;\psi) = [\,Ww(S;\chi_1), \ldots, Ww(S;\chi_i), \ldots, Ww(S;\chi_j), \ldots, Ww(S;\chi_k)\,]$$

$I_i$ is an indicator function defined by:

$I_i = 0$ when $Ww(S;\chi_i) = 0$

$I_i = 1$ when $Ww(S;\chi_i) > 0$


They  further  show  that  these  equations  have  a  solution  of  the  form: 


$$p(W) = \frac{1}{G}\,Y_1^{n_1}\,Y_2^{n_2} \cdots Y_k^{n_k} \tag{6.2.2}$$

where: The $Y_i$ depend only on workload and processor parameters.

The $n_i$ are the numbers of requests at each $\chi_i$ given by

$$n_i = \frac{Ww(S;\chi_i)}{Wb(S;\chi_i)}$$

and G is a normalization constant given by:

$$G = \sum_{W}\,\prod_{i=1}^{k} Y_i^{n_i}$$

By observing that from Equation (6.2.2)

$$p(W_{ij}) = \frac{Y_i}{Y_j}\,p(W)$$

since $p(W_{ij}) = \frac{1}{G}\,Y_1^{n_1} \cdots Y_i^{n_i+1} \cdots Y_j^{n_j-1} \cdots$,
substitution in the balance equations (6.2.1) shows that the
product form of Equation (6.2.2) is valid if and only if equations
like:

$$Y_j\,\frac{P(S;\chi_j)}{Wb(S;\chi_j)} = \sum_{i=1}^{k} Y_i\,\frac{P(S;\chi_i)}{Wb(S;\chi_i)}\,(q_{ij} + q_{i0}q_{0j}), \qquad j = 1,\ldots,k \tag{6.2.3a}$$

are valid.


Now if we substitute for $Y_i$ the quantity

$$\frac{P(S;\chi_i,\psi)}{P(S;\chi_i)}$$

into Equation (6.2.3a), we get:

$$\frac{P(S;\chi_j,\psi)}{Wb(S;\chi_j)} = \sum_{i=1}^{k} \frac{P(S;\chi_i,\psi)}{Wb(S;\chi_i)}\,(q_{ij} + q_{i0}q_{0j}) \tag{6.2.3b}$$

If we add the additional equation:

$$\frac{P(S;\chi_0,\psi)}{Wb(S;\chi_0)} = \frac{P(S;\psi)}{Wd(S;\psi)} = \sum_{i=1}^{k} \frac{P(S;\chi_i,\psi)}{Wb(S;\chi_i)}\,q_{i0}$$

we then get

$$\frac{P(S;\chi_j,\psi)}{Wb(S;\chi_j)} = \sum_{i=0}^{k} \frac{P(S;\chi_i,\psi)}{Wb(S;\chi_i)}\,q_{ij}, \qquad j = 0,\ldots,k$$

and when each side is multiplied by

$$\frac{Wd(S;\psi)}{P(S;\psi)}$$

we get

$$\frac{Wd(S;\chi_j)}{Wb(S;\chi_j)} = \sum_{i=0}^{k} \frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,q_{ij}, \qquad j = 0,\ldots,k \tag{6.2.3c}$$

But these are just the earlier Equations (5.2.5), the Work
Distribution Node Equations.


Since we also have under configuration work flow balance that

$$\frac{P(S;\chi_i,\psi)}{Wd(S;\chi_i)} = \frac{P(S;\chi_j,\psi)}{Wd(S;\chi_j)}, \qquad \text{all } i,j = 0,\ldots,k$$

and since

$$Y_i = \frac{P(S;\chi_i,\psi)}{P(S;\chi_i)}$$

is a solution of the Equations (6.2.3a), then

$$Y_i = \frac{P(S;\chi_i,\psi)}{P(S;\chi_i)} \cdot \frac{Wd(S;\chi_i)}{P(S;\chi_i,\psi)} = \frac{Wd(S;\chi_i)}{P(S;\chi_i)}$$

is also a solution. (The second factor has the same value for every i,
and Equations (6.2.3a) are linear and homogeneous in the $Y_i$.)

Notice that since $P(S;\chi_i,\psi)/P(S;\chi_i) = U(S;\chi_i,\psi)$, the
utilization, then $Y_i/Y_j$ is the relative utilization of $\chi_i$ to
$\chi_j$, a fact which we used in our previous bottleneck analysis.

We thus have a solution of products of terms

$$Y_i^{\,n_i} = \left[\frac{Wd(S;\chi_i)}{P(S;\chi_i)}\right]^{n_i}$$

and here the $Wd(S;\chi_i)$ are parameters of the workload and the
$P(S;\chi_i)$ are parameters of the subconfigurations. The $n_i$ which
define a system state depend on the work waiting at the
subconfiguration $\chi_i$ and the blocksize work, an implementation
parameter.
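
The product form can be illustrated by brute-force enumeration for a small closed system. The sketch below is only an illustration: the function names are hypothetical, and it borrows the cpu and disk Txd values of the Section 6.1 example as the Y terms. It checks that the ratio of device utilizations equals the ratio of the Y terms.

```python
from itertools import product


def product_form(y, n_total):
    """Return {state: probability} for a closed network with n_total requests.

    y[i] is the term Y_i = Wd(S;x_i) / P(S;x_i) for device i.
    A state is a tuple (n_1, ..., n_k) with n_1 + ... + n_k = n_total.
    """
    k = len(y)
    weights = {}
    for state in product(range(n_total + 1), repeat=k):
        if sum(state) != n_total:
            continue
        w = 1.0
        for y_i, n_i in zip(y, state):
            w *= y_i ** n_i           # product of Y_i ** n_i
        weights[state] = w
    g = sum(weights.values())         # normalization constant G
    return {s: w / g for s, w in weights.items()}


def utilization(probs, i):
    """Proportion of time device i holds at least one request."""
    return sum(p for s, p in probs.items() if s[i] > 0)


y = [1 / 3, 1 / 2]                    # cpu and disk Txd, in seconds
probs = product_form(y, n_total=4)
u_cpu, u_disk = utilization(probs, 0), utilization(probs, 1)
print(round(u_cpu / u_disk, 6), round(y[0] / y[1], 6))
```

Enumeration grows quickly with the number of devices and requests, which is why the algorithms of the next section avoid it.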



6.3  ALGORITHMS  FOR  COMPUTING  PERFORMANCE  QUANTITIES 

We now present two methodologies for the computation of system
throughputs and device queue lengths and utilizations for systems
having product form solutions. Each methodology has its distinct
point of view and each is presented in a software physics formulation.
The first is based on work by Reiser [REIS78], and is developed
from intuitively appealing principles. The second is an adaptation
of Buzen's algorithm [BUZE73] applied to systems where the
average work demand increases proportionately to the number of
concurrent interactions in the system. We will present these for
the case of single job classes only, although their bases have
been extended elsewhere to multiple classes [REIS78, ROOD78].

6.3.1  Mean  Value  Analysis  -  Fixed  Service/Workload  Parameters 

This  approach  to  performance  quantity  computation  depends  on 
these  principles: 

(1) Upon arrival, an interaction "sees" a system as one with
itself removed in long-term equilibrium.

(2) Little's Law is applied to the system as a whole and to
individual queues.


The first principle is a consequence of the theorem stated by
Reiser in [REIS78] and here paraphrased:


"In a closed queueing network with product-form solution,
the probability to see a state $Ww(S;\psi)$ upon customer arrival
when there are N interactions in a system is the same as
the long term equilibrium probability of $Ww(S;\psi)$ in the
system with N-1 interactions present."

Letting $Tw(S_1;\chi_i)[N]$ be the waiting time required by a single
visit to $\chi_i$ when the system load is N, we write:

$$Tw(S_1;\chi_i)[N] = \frac{Wb(S;\chi_i) + Ww(S;\chi_i)[N-1]}{P(S;\chi_i)} \tag{6.3.1}$$

That is, the waiting time consists of the time to complete
$Wb(S;\chi_i)$ units of work from the arriving request plus the
$Ww(S;\chi_i)[N-1]$ units of work waiting when the interaction arrives.

Now, the interaction $S_1$ will require $Wd(S;\chi_i)/Wb(S;\chi_i)$
separate visits to $\chi_i$, so we write for the accumulated waiting
time:

$$\frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,Tw(S_1;\chi_i)[N] = \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \left(1 + \frac{Ww(S;\chi_i)[N-1]}{Wb(S;\chi_i)}\right) \tag{6.3.2}$$

This equation gives the accumulated waiting time in terms of
the total work required at $\chi_i$ by the interaction $S_1$ over its
entire life, the absolute power of $\chi_i$ and the average request
count $Ww(S;\chi_i)[N-1]/Wb(S;\chi_i)$ seen upon arrival.

We now sum the accumulated waiting time over all $\chi_i$ to get the
total time that the interaction is in the system, either in
queues or in service. This time is "busy" time at the configuration
level so we write:

$$Tb(S_1;\psi)[N] = \sum_{i=1}^{k} \frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,Tw(S_1;\chi_i)[N] \tag{6.3.3}$$

Note that we have made explicit use of our assumption that an
interaction is present only at a single subconfiguration of a
partition of $\psi$ at any given instant.


We now apply Little's Law to the system having N similar
transactions to get the throughput of work:

$$P(S;\psi)[N] = \frac{N \cdot Wd(S;\psi)}{Tb(S_1;\psi)[N]} \tag{6.3.4a}$$

or for the interaction throughput:

$$X_0(S)[N] = \frac{P(S;\psi)[N]}{Wd(S;\psi)} = \frac{N}{Tb(S_1;\psi)[N]} \tag{6.3.4b}$$


The average work waiting (including that in service) at a
device is given by the application of Little's Law at that
subconfiguration:

$$Ww(S;\chi_i)[N] = Tw(S_1;\chi_i)[N] \cdot P(S;\chi_i,\psi)[N]$$

First multiplying the right hand side by 1 in the form of

$$\frac{Wd(S;\chi_i)}{Wb(S;\chi_i)} \cdot \frac{Wb(S;\chi_i)}{Wd(S;\chi_i)}$$

we get:

$$Ww(S;\chi_i)[N] = Wb(S;\chi_i) \cdot \frac{P(S;\chi_i,\psi)[N]}{Wd(S;\chi_i)} \cdot \frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,Tw(S_1;\chi_i)[N]$$

and multiplying the right-hand side again by 1 in the form of
$Wd(S;\psi)/Wd(S;\psi)$ and recalling that

$$\frac{Wd(S;\chi_i)}{Wd(S;\psi)} = \frac{P(S;\chi_i,\psi)[N]}{P(S;\psi)[N]}$$

we get:

$$Ww(S;\chi_i)[N] = Wb(S;\chi_i) \cdot \frac{P(S;\psi)[N]}{Wd(S;\psi)} \cdot \frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,Tw(S_1;\chi_i)[N] \tag{6.3.5a}$$

Since the recursion (6.3.1) is in terms of the average request
count $Ww(S;\chi_i)/Wb(S;\chi_i)$ it is more convenient to write:

$$\frac{Ww(S;\chi_i)[N]}{Wb(S;\chi_i)} = \frac{P(S;\psi)[N]}{Wd(S;\psi)} \cdot \frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,Tw(S_1;\chi_i)[N] \tag{6.3.5b}$$

Or in terms of the interaction throughput:

$$\frac{Ww(S;\chi_i)[N]}{Wb(S;\chi_i)} = X_0(S)[N] \cdot \frac{Wd(S;\chi_i)}{Wb(S;\chi_i)}\,Tw(S_1;\chi_i)[N] \tag{6.3.5c}$$

Finally we derive the utilization of $\chi_i$ directly from:

$$U(S;\chi_i,\psi)[N] = \frac{P(S;\chi_i,\psi)[N]}{P(S;\chi_i)} = \frac{P(S;\psi)[N]}{Wd(S;\psi)} \cdot \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \tag{6.3.6a}$$

Or in terms of interaction throughput:

$$U(S;\chi_i,\psi)[N] = X_0(S)[N] \cdot \frac{Wd(S;\chi_i)}{P(S;\chi_i)} \tag{6.3.6b}$$


Txd(S;disk) = Wd(S;disk)/P(S;disk) = 0.2 seconds


So the disk is the bottleneck device. The asymptotic system
throughput (determined by the disk) is:

$$X_0(S)_{max} = 1/0.2 = 5\ \text{interactions/second}$$

The minimum (1-interaction) response time is:

$$Tb(S_1;\psi)_{min} = 0.15 + 0.2 = 0.35\ \text{seconds},$$

so the 1-interaction throughput is:

$$1/Tb(S_1;\psi)_{min} = 1/0.35 = 2.86\ \text{interactions/second}$$

Finally, the system saturation point is:

$$N^* = \frac{P(S;disk)}{Wd(S;disk)} \sum_{i} \frac{Wd(S;\chi_i)}{P(S;\chi_i)} = \frac{1}{0.2}\,(0.15 + 0.2) = 1.75$$

The  resulting  throughput  function  is  sketched  in  Figure  6.3.2 


Figure  6.3.2 

Central  System  Throughput  -  Bottleneck  Analysis 


We now apply the algorithms of the previous section to compute
the exact performance quantities. We start the recursion by
setting $Ww(S;\chi_i)[0] = 0$. The process and results are summarized
in Table 6.3.1. Note by observing the device utilization columns
in the table that, as expected, it is the disk which shows the
highest utilization at each level of multiprogramming load.

The interaction throughput function $X_0(S)[N]$ is plotted in
Figure 6.3.3 and realizes in exact form the approximating
asymptotic sketch of Figure 6.3.2.

Table 6.3.1 (column groups: CPU, DISK, CONFIGURATION)

Central System Throughput - Mean Value Analysis
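
The recursion can be sketched in a few lines. The code below is a minimal illustration of Equations (6.3.2) through (6.3.6b) for the cpu and disk example, written in terms of the demand execution times Txd and the request counts Ww/Wb, so that blocksize and demand values need not be known separately. The function name and the row layout are assumptions made for this sketch.

```python
def mean_value_analysis(txd, n_max):
    """Exact MVA for a closed, single-class network with fixed demands.

    txd[i] is the demand execution time Wd(S;x_i) / P(S;x_i) in seconds.
    Returns one row per load N = 1..n_max with the interaction throughput,
    response time, mean request counts and utilizations.
    """
    counts = [0.0] * len(txd)               # Ww/Wb at load 0, i.e. Ww[0] = 0
    rows = []
    for n in range(1, n_max + 1):
        # Equation (6.3.2): accumulated waiting time at each device.
        waits = [t * (1.0 + q) for t, q in zip(txd, counts)]
        tb = sum(waits)                     # Equation (6.3.3)
        x0 = n / tb                         # Equation (6.3.4b)
        counts = [x0 * w for w in waits]    # Equation (6.3.5c)
        utils = [x0 * t for t in txd]       # Equation (6.3.6b)
        rows.append({"N": n, "X0": x0, "Tb": tb,
                     "counts": counts, "U": utils})
    return rows


rows = mean_value_analysis([0.15, 0.2], n_max=8)    # cpu, disk
for r in rows:
    print(r["N"], round(r["X0"], 3), round(r["Tb"], 3),
          [round(u, 3) for u in r["U"]])
```

At N = 1 the sketch reproduces the 1-interaction throughput of 1/0.35, and the throughput stays below the asymptotic bound of 5 interactions/second at every load.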


6.3.2  Buzen's  Algorithm  With  Overhead  Performance  Degradation 


In [BUZE73] Buzen demonstrates efficient algorithms for the
computation of the normalizing constant, G, in the product form
solution. In a subsequent paper [BUZE77] he shows how G computed
at various levels of multiprogramming load denoted by G[N] are


The arbitrary load dependent factors divide the

$$Y_r = \frac{Wd(S;\chi_r)}{P(S;\chi_r)}$$

to express the variation of service time at each load i.

Now suppose that for our special case we have that:

$$Wd(S;\chi_i)[N] = (1+D)\,Wd(S;\chi_i)[N-1], \qquad N = 2, 3, \ldots \tag{6.3.11}$$

where D is a constant factor of increase in interaction work
demand for each increment of multiprogramming load over one.

Defining the interaction demand execution time:

$$Txd(S;\chi_i)[n] = \frac{Wd(S;\chi_i)[n]}{P(S;\chi_i)}$$

and letting $Txd(S;\chi_i)[1]$ be more simply denoted as $Txd(S;\chi_i)$,
we have:

$$Txd(S;\chi_i)[n] = Txd(S;\chi_i) \cdot (1+D)^{n-1}$$

Now the m-th term in the sum in Equation (6.3.10) is:

$$[Txd(S;\chi_i)]^{m}\,(1+D)^{m-1} \cdot (1+D)^{m-2} \cdots (1+D)^{m-m}$$

which we may rewrite as:

$$[Txd(S;\chi_i)]^{m}\,(1+D)^{S(m-1)} \qquad \text{where} \qquad S(k) = \sum_{i=1}^{k} i$$
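
As a quick check on the exponent, S(m-1) has the familiar closed form for the sum of the first m-1 integers. The worked case m = 3 below is an illustration added here and is not part of the original derivation:

```latex
S(m-1) = \sum_{i=1}^{m-1} i = \frac{m(m-1)}{2},
\qquad
[Txd(S;\chi_i)]^{3}\,(1+D)^{2}(1+D)^{1}(1+D)^{0}
  = [Txd(S;\chi_i)]^{3}\,(1+D)^{3}
```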

Figure 6.3.4 conceptualizes the recursion in a two dimensional
array. Each column represents a computation with parameters for
a different device. The zeroth element in each column contains
a "one." There is a zeroth column with entries of zero for loads
1 through N to start the recursion. One proceeds by computing
g(n,r) by progressing down a column beginning with r=1. The
terms summed as indicated give g(n,r) and the final column has
the elements that correspond to the elements G[N] of Equations
(6.3.7, 6.3.8, 6.3.9).

Figure 6.3.4 (array columns: DEVICES)
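
The column-by-column recursion can be sketched as follows. This is a hedged illustration rather than the report's own program: the function names are hypothetical, the per-device term is the one derived above, txd^m (1+D)^S(m-1), and the throughput is taken from the standard Buzen relation X0[N] = G[N-1]/G[N], which is assumed here to be what Equations (6.3.7) to (6.3.9) express. The example values are the cpu and disk demand execution times of 0.15 and 0.2 seconds.

```python
def normalizing_constants(txd, degr, n_max):
    """Return [G[0], ..., G[n_max]] by convolution over devices.

    txd[r]  : demand execution time of device r at load one (seconds)
    degr[r] : D for device r, the fractional demand increase per unit load
    The m-th term for device r is txd[r]**m * (1 + degr[r])**S(m-1),
    with S(k) = 1 + 2 + ... + k.
    """
    # Zeroth column: a one at load 0, zeros for loads 1..n_max.
    g = [1.0] + [0.0] * n_max
    for t, d in zip(txd, degr):
        term = [t ** m * (1.0 + d) ** (m * (m - 1) // 2)
                for m in range(n_max + 1)]
        # New column: g(n, r) = sum over m of term[m] * g(n - m, r - 1).
        g = [sum(term[m] * g[n - m] for m in range(n + 1))
             for n in range(n_max + 1)]
    return g


def throughputs(g):
    """X0[N] = G[N-1] / G[N] for N = 1..n_max."""
    return [g[n - 1] / g[n] for n in range(1, len(g))]


txd = [0.15, 0.2]                        # cpu, disk
for cpu_degr in (0.05, 0.10, 0.15):
    x = throughputs(normalizing_constants(txd, [cpu_degr, 0.05], 12))
    peak = max(range(len(x)), key=x.__getitem__) + 1
    print(cpu_degr, "peak at N =", peak, "X0 =", round(x[peak - 1], 3))
```

With both degradation factors set to zero the recursion reduces to the fixed-demand case, which gives a convenient cross-check against mean value analysis.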


6.3.2.1 The Example System - Continuation


We can now incorporate overhead work demand increases according
to the scheme just developed into the system of Section 6.3.1.1.
Recall:

$$Txd(S;cpu) = \frac{Wd(S;cpu)}{P(S;cpu)} = 0.15\ \text{seconds}$$

$$Txd(S;disk) = \frac{Wd(S;disk)}{P(S;disk)} = 0.2\ \text{seconds}$$

For the disk we take a constant 5% increase in overhead work
for each increment in load. For the cpu we take possible increases
to be 5%, 10%, and 15% and examine each case. The resulting
predicted throughputs are given in Table 6.3.2. These are
plotted in Figure 6.3.5 and show what is usual for systems with
overheads that increase with multiprogramming load: the throughput
function achieves a maximum and then starts a decline for
further increases in load.


6.3.2.2 A Graphical Decomposition Solution

We now return to our example system as originally given; a
collection of terminals interacting with the central subsystem
just analyzed. Courtois [COUR75] has observed that systems
are nearly completely decomposable into groups of smaller
subsystems if the state changes between the subsystems occur at
a rate much slower than those within the subsystems. Thus,
observing that the rate of interaction submission from the
terminals for M signed-on terminals is:

$$\frac{M - N}{Z}$$

where N is the number of interactions in process
in the central subsystem,


and assuming that the maximum submission rate, given by M/Z,
is much less than the rate of state transitions within the
central subsystem, we can treat the composite as the "joining"
of two equivalent components each with characteristics separately
determined as a function of the load N. A useful graphical
method for this joining process is conceptually shown in
Figure 6.3.6. A typical central subsystem throughput function
is plotted and its intersections with the terminal subsystem
interaction submission function are marked.


Figure 6.3.6

Decomposition Analysis - Conceptual

Each of these points is one where interaction submission rate
equals the interaction throughput (completion rate) of the
central subsystem. However, of the three solutions only Na and
Nc are stable, that is, these points have the property of
attracting the subsystem from neighboring values of N. The point Nb,
though a solution, does not have this property. To show this,
first consider a stable point such as Na. If the number of
interactions in the central subsystem should increase to a
value Na + Δn, the central subsystem responds with an increased
throughput which exceeds the submission rate and thus tends to
return the subsystem interaction count to Na. Similarly, a
decrease to Na - Δn is met with a lowering of throughput below
the submission rate. This results in increased congestion
raising the subsystem interaction count back towards Na. At
Nb, however, an increase to Nb + Δn is met with a decrease
of throughput thus causing further and further congestion until
stability is reached at Nc. A decrease to Nb - Δn is responded
to with increased throughput, driving the subsystem towards Na.

Now we would like to stabilize the systems at a point like Na,
to the left of the maximal throughput in the interest of
minimizing the response time observed by a terminal user. This
time, given by an application of Little's Law, is:

$$Tb(S_1;\psi) = \frac{Na}{X_0(S)}$$

where $X_0(S)$ is the throughput at Na.

The effects of varying the number of signed-on terminals, M,
in our example can now be observed by indicating with lines or
tick marks, the intersection of the lines

$$\frac{M - N}{Z}$$

with the central subsystem throughput functions.

This is done in Figure 6.3.7 and we make several observations
based on the number of terminals signed-on and the cpu
degradation factor which is operative. Specifically for this example:

(i) For 40 terminals, there is no intersection, Na, with
any of the throughput functions such that the intersection
is to the left of the peak throughput value.

(ii) For 35 terminals there are such intersections only for
the cases where the cpu degradation is 0.05 or 0.10.
If we were uncertain that the cpu degradation was within
these limits, we should consider the choice of this
number of terminals to be served to be at risk of not
providing acceptable response time.

(iii) For the other values of M shown (28, 30, 32) we have
all three cpu degradation factors providing solution
points, Na, to the left of the maximal throughputs.
So we are at the least risk of ending up stabilized
at a solution point like Nc of the conceptual discussion
corresponding to a low throughput and relatively large
number of interactions in the system.

The response times corresponding to the solution points to the
left of maximum throughput are given by:

$$Tb(S_1;\psi) = \frac{Na}{X_0(S)}$$

and are summarized in Table 6.3.3 below.


                 DEGR(cpu) = 0.05       DEGR(cpu) = 0.10       DEGR(cpu) = 0.15
M (Terminals)    X0(S)   Na     Tb      X0(S)   Na     Tb      X0(S)   Na     Tb
28               3.5     1.45   0.44    3.3     1.55   0.47    3.3     1.6    0.48
30               3.55    1.8    0.51    3.55    1.8    0.51    3.5     2.0    0.57
32               3.75    2.2    0.59    3.75    2.2    0.59    3.65    2.6    0.71
35               3.95    3.4    0.86    3.9     3.6    0.92    -       -      -

X0(S) in interactions/second; Tb = Tb(S1;ψ) in seconds.

Table 6.3.3
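
The graphical joining can also be approximated numerically. The sketch below is an illustration under stated assumptions: the think time Z = 8 seconds is a hypothetical value chosen for the example (this section does not state it), the throughput curve comes from the convolution recursion with degradation, and intersections are located by linear interpolation between integer loads, so the values will differ somewhat from points read off a plotted curve. All function names are assumptions of the sketch.

```python
def throughput_curve(txd, degr, n_max):
    """X0[0..n_max] from the convolution recursion with degradation."""
    g = [1.0] + [0.0] * n_max
    for t, d in zip(txd, degr):
        term = [t ** m * (1.0 + d) ** (m * (m - 1) // 2)
                for m in range(n_max + 1)]
        g = [sum(term[m] * g[n - m] for m in range(n + 1))
             for n in range(n_max + 1)]
    return [0.0] + [g[n - 1] / g[n] for n in range(1, n_max + 1)]


def solution_points(x0, terminals, think_time):
    """Intersections of (M - N)/Z with X0(N), found between integer loads.

    Returns (N, X0, stable) tuples; N and X0 are interpolated linearly.
    """
    def excess(n):                        # submission rate minus throughput
        return (terminals - n) / think_time - x0[n]

    points = []
    for n in range(min(terminals, len(x0) - 1)):
        lo, hi = excess(n), excess(n + 1)
        if lo * hi < 0.0:                 # sign change: a crossing lies here
            frac = lo / (lo - hi)
            load = n + frac
            rate = x0[n] + frac * (x0[n + 1] - x0[n])
            # Stable when submission exceeds throughput just to the left.
            points.append((load, rate, lo > 0.0))
    return points


Z = 8.0   # assumed think time in seconds, for illustration only
x0 = throughput_curve([0.15, 0.2], [0.05, 0.05], n_max=40)
for m in (28, 35, 40):
    for load, rate, stable in solution_points(x0, m, Z):
        print(m, round(load, 2), round(rate, 2), round(load / rate, 2),
              "stable" if stable else "unstable")
```

The last printed column is the response time Na/X0(S) from Little's Law; under these assumptions the 40-terminal case has no solution point to the left of the throughput peak.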
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